-
Notifications
You must be signed in to change notification settings - Fork 706
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a Config class to make configuration understandable #914
Conversation
* Add this class name and the md5 hash of it into the config | ||
*/ | ||
def setScaldingFlowClass(clazz: Class[_]): Config = | ||
this.+(ScaldingFlowClassName -> clazz.getName).+(ScaldingFlowClassSignature -> Config.md5Identifier(clazz)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you use .+ here and space everywhere else? Not a big deal, just curious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
somehow the riform was wrapping it, and breaking semicolon inference. :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, any idea if it was only 2.9 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ianoc no, I didn't try with 2.10.
It was wrapping it like:
this + x
+ y
rather than
this + y
+ x
Maybe neither would work, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, funky. cool
looks good, very minor questions. both tests failed, which is odd, but it looks like both are to travis... let's see! |
Actually, I think it was a real break. I think somehow I broke serialization. Maybe I was wrong. |
Think that got it. @jcoveney another look? |
Add a Config class to make configuration understandable
* with a class to serialize to bootstrap the process: | ||
* Left((classOf[serialization.KryoHadoop], myInstance)) | ||
*/ | ||
def setSerialization(kryo: Either[(Class[_ <: KryoInstantiator], KryoInstantiator), Class[_ <: KryoInstantiator]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these built up just once, or would it make sense that we expand this class so we can pass the 4 ser types around and then just reify into the map at a final toMap step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method should only be called once (it overwrites the keys it touches). Generally, users would not call this (even the Config.default sets up something reasonable).
I thought about that approach too, which I like more in many ways. A better approach might be to build up a Function from Config => Config
. Then, before we run, we put in the external configuration (from the JobConf, or Args, or whatever), and the function applies to that.
This just seemed simpler at the time. Feel free to improve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is moving to a Config => Config model you think we should note as a 1.0 issue or just a nice to have we can more or less skip?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can skip it for now. If we want more insurance, we can make Config a trait, and use a private implementation rather than the case class. That might give us more flexibility later.
Btw great work, nice big step towards the 1.0 -- hopefully with this and your other PR we can drop some code from SB too. |
This is an attempt to unify the jank in summingbird-scalding, internal to Twitter, and in the Job code.
Next up, a clean way to start and run jobs without a Job class.