Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a Config class to make configuration understandable #914

Merged
merged 2 commits into from
Jun 26, 2014

Conversation

johnynek
Copy link
Collaborator

This is an attempt to unify the jank in summingbird-scalding, internal to Twitter, and in the Job code.

Next up, a clean way to start and run jobs without a Job class.

* Add this class name and the md5 hash of it into the config
*/
def setScaldingFlowClass(clazz: Class[_]): Config =
this.+(ScaldingFlowClassName -> clazz.getName).+(ScaldingFlowClassSignature -> Config.md5Identifier(clazz))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you use .+ here and space everywhere else? Not a big deal, just curious.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

somehow the riform was wrapping it, and breaking semicolon inference. :/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, any idea if it was only 2.9 ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ianoc no, I didn't try with 2.10.

It was wrapping it like:

this + x
+ y

rather than

this + y
  + x

Maybe neither would work, though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, funky. cool

@jcoveney
Copy link
Contributor

looks good, very minor questions. both tests failed, which is odd, but it looks like both are to travis... let's see!

@johnynek
Copy link
Collaborator Author

Actually, I think it was a real break. I think somehow I broke serialization. Maybe I was wrong.

@johnynek
Copy link
Collaborator Author

Think that got it. @jcoveney another look?

jcoveney added a commit that referenced this pull request Jun 26, 2014
Add a Config class to make configuration understandable
@jcoveney jcoveney merged commit 9f35823 into develop Jun 26, 2014
@jcoveney jcoveney deleted the oscar-config-clean branch June 26, 2014 07:02
* with a class to serialize to bootstrap the process:
* Left((classOf[serialization.KryoHadoop], myInstance))
*/
def setSerialization(kryo: Either[(Class[_ <: KryoInstantiator], KryoInstantiator), Class[_ <: KryoInstantiator]],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these built up just once, or would it make sense that we expand this class so we can pass the 4 ser types around and then just reify into the map at a final toMap step?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method should only be called once (it overwrites the keys it touches). Generally, users would not call this (even the Config.default sets up something reasonable).

I thought about that approach too, which I like more in many ways. A better approach might be to build up a Function from Config => Config. Then, before we run, we put in the external configuration (from the JobConf, or Args, or whatever), and the function applies to that.

This just seemed simpler at the time. Feel free to improve it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is moving to a Config => Config model you think we should note as a 1.0 issue or just a nice to have we can more or less skip?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can skip it for now. If we want more insurance, we can make Config a trait, and use a private implementation rather than the case class. That might give us more flexibility later.

@ianoc
Copy link
Collaborator

ianoc commented Jun 26, 2014

Btw great work, nice big step towards the 1.0 -- hopefully with this and your other PR we can drop some code from SB too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants