reef-dev mailing list archives

From Markus Weimer
Subject Re: question on proper ConfigurationModule for configuring YARN runtime
Date Fri, 18 Dec 2015 18:14:04 GMT
On 2015-12-16 20:20, Tobin Baker wrote:
> driver.YarnDriverConfiguration is only intended for internal use, 
> while both YarnClientConfiguration and client.YarnDriverConfiguration
> are intended for use by the application


> I'm not entirely clear when an application would e.g.  specify 
> YarnClientConfiguration.YARN_QUEUE_NAME vs. 
> client.YarnDriverConfiguration.QUEUE.

YarnClientConfiguration.YARN_QUEUE_NAME is the default for when an app
doesn't specify client.YarnDriverConfiguration.QUEUE. The idea is that a
single `REEF` instance can be used to submit many Drivers, not all of
which go to the same queue.
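
To make the fallback concrete, here is a minimal sketch in plain Java. It does not use any REEF classes: `QueueFallbackSketch` and `resolveQueue` are hypothetical names, and the queue names are made up. It only models the rule described above, where the per-Driver queue wins when it is set and the client-wide default applies otherwise.

```java
import java.util.Optional;

// Hypothetical model of the fallback rule; this is not REEF code.
final class QueueFallbackSketch {

  /**
   * Picks the queue for one Driver submission.
   *
   * @param clientDefaultQueue stands in for YarnClientConfiguration.YARN_QUEUE_NAME
   * @param driverQueue stands in for client.YarnDriverConfiguration.QUEUE, if set
   * @return the per-Driver queue when present, else the client-wide default
   */
  static String resolveQueue(final String clientDefaultQueue,
                             final Optional<String> driverQueue) {
    return driverQueue.orElse(clientDefaultQueue);
  }

  public static void main(final String[] args) {
    // One client-wide default, shared by every Driver submitted through it.
    final String clientDefault = "default";

    // A Driver that does not name a queue falls back to the default.
    System.out.println(resolveQueue(clientDefault, Optional.empty()));

    // A Driver that names a queue overrides the default for itself only.
    System.out.println(resolveQueue(clientDefault, Optional.of("analytics")));
  }
}
```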

> Also, I can't see anywhere that client.YarnDriverConfiguration.CONF 
> is used, so I'm a bit confused. Is the client expected to manually 
> merge the configuration built from 
> client.YarnDriverConfiguration.CONF into the Driver configuration 
> that they pass to REEF.submit()

Yes. We needed an escape hatch from our abstractions: when the Client
*knows* it is submitting to YARN, it can use that class to specify
additional parameters that only YARN understands. I concede it isn't the
most elegant approach.
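
As a rough illustration of the manual merge, here is a sketch in plain Java that treats each configuration as a map of named parameters. This is an assumption made for illustration only: real REEF configurations are Tang `Configuration` objects, not maps, and `ConfigMergeSketch`, `merge` and the values below are hypothetical. Rejecting conflicting values is a choice made in this sketch, not a statement about how REEF behaves.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of merging a YARN-only fragment into a Driver
// configuration before submission; this is not REEF code.
final class ConfigMergeSketch {

  /**
   * Returns a new map holding every entry of both inputs.
   * Throws if the same key is bound to two different values.
   */
  static Map<String, String> merge(final Map<String, String> driverConf,
                                   final Map<String, String> yarnConf) {
    final Map<String, String> merged = new LinkedHashMap<>(driverConf);
    for (final Map.Entry<String, String> entry : yarnConf.entrySet()) {
      // putIfAbsent returns the existing value, or null if the key was new.
      final String previous = merged.putIfAbsent(entry.getKey(), entry.getValue());
      if (previous != null && !previous.equals(entry.getValue())) {
        throw new IllegalArgumentException(
            "Conflicting values for " + entry.getKey());
      }
    }
    return merged;
  }

  public static void main(final String[] args) {
    // Runtime-independent Driver parameters (illustrative key and value).
    final Map<String, String> driverConf = new LinkedHashMap<>();
    driverConf.put("DRIVER_IDENTIFIER", "my-driver");

    // YARN-only parameters the Client adds because it knows it targets YARN.
    final Map<String, String> yarnConf = new LinkedHashMap<>();
    yarnConf.put("QUEUE", "analytics");

    // The merged result is what the Client would hand to the submit call.
    System.out.println(merge(driverConf, yarnConf));
  }
}
```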

It stems from the early decision that a Driver submission shall be a
`Configuration` and nothing else. I now think that was a mistake, and
we designed it differently in C#, where the `DriverSubmission` contains
the Driver `Configuration` but also other parameters (like the
submission queue).

> If this is the right interpretation, then it seems like I should be 
> putting the JOB_SUBMISSION_DIRECTORY_PREFIX parameter into 
> client.YarnDriverConfiguration.

Yes, that seems consistent with the rest of the parameters.

