reef-dev mailing list archives

From Saikat Kanjilal <sxk1...@gmail.com>
Subject Re: Design Considerations on reef-1791
Date Wed, 14 Jun 2017 01:42:28 GMT
:)))))), Hi John,
I am not working on the mesos runtime itself, but rather using it as a template for building the
REEF runtime on Spark. Please read my email below carefully :) and let me know your thoughts
on extending parts of this runtime to the REEF Spark runtime architecture.
Regards

Sent from my iPad

> On Jun 13, 2017, at 6:19 PM, John Yang <johnyangk@gmail.com> wrote:
> 
> Hi Saikat,
> 
> 
> Many thanks for working on the mesos runtime!
> I can answer 4): Yes, we can do without the extra remote managers, but with
> some caveats.
> 
> By default, Mesos employs pessimistic concurrency control
> <https://research.google.com/pubs/pub41684.html> in giving out resource
> offers.
> So from our (REEF) perspective, once we get a resource offer from Mesos, I
> believe the offer is pretty much for us to keep without any other job
> taking it away from us.
> With this in mind, the mesos runtime can do the following, which doesn't
> really require any extra RemoteManagers.
> 
>   - Upon start: Be a good citizen and reject any incoming offers, since we
>   don't need any resources yet
>   - Upon resource request: Keep an appropriate offer
>   - Upon resource launch: Simply launch a REEF evaluator with the offer
> 
> Let's call this Design A
> 
> However, the current mesos runtime implementation (let's call it Design B)
> does not work like Design A.
> The main reason is that Mesos allows custom allocators
> <http://mesos.apache.org/documentation/latest/allocation-module/#writing-a-custom-allocator>
> that make offers to multiple jobs simultaneously.
> So to be safe, Design B launches a Mesos task upon resource request, and
> the task sets up a RemoteManager channel through which the REEF evaluator
> is launched.
> 
> I must admit that had I known more about the pessimistic locking 3 years
> ago when I wrote the mesos runtime, I would've thought about going with
> Design A, which covers the common case much more nicely.
> And then, I would've handled the behaviors of custom allocators as
> exceptional cases through implementing the Scheduler#offerRescinded
> callback, although I'm still not sure if it's straightforward to do so with
> REEF.
> 
> All in all, I believe the mesos runtime hasn't really been maintained since
> it was first written, and has bits that need to be refactored.
> For example, I see that we're still using Mesos 0.25.0, when 1.2.0
> <http://mesos.apache.org/> has been released.
> 
> Hope this helps.
> 
> 
> Thanks,
> John
> 
> 
>> On Wed, Jun 14, 2017 at 8:17 AM, Saikat Kanjilal <sxk1969@gmail.com> wrote:
>> 
>> @Markus/Sergiy,
>> I've spent the past few days studying the implementation of
>> reef-runtime-mesos and have some things I want to discuss. As I mentioned
>> before, I created reef-runtime-spark as a clone of the mesos runtime as a
>> first step. However, the more I look at the code and try to figure out how
>> to merge
>> https://github.com/apache/reef/tree/master/lang/scala/reef-examples-scala/src/main/scala/org/apache/reef/examples/hellospark
>> into reef-runtime-spark, the more things come to mind that need
>> further discussion:
>> 
>> 1) The mesos runtime currently uses Google Protocol Buffers and the Mesos
>> task API. I am assuming we don't need any of this for the spark runtime, or
>> any of the interfaces with Avro. Is that assumption correct?
>> 2) I see a lot of classes in the org.apache.reef.runtime.mesos.driver
>> package associated with launching, releasing, and requesting resources. In
>> the interim I renamed all of these to Spark versions and am assuming we can
>> still reuse them. Do you see any issues with this? If we can reuse them, they
>> will be available through the SparkDriverConfiguration, which extends
>> ConfigurationModuleBuilder (again similar to the Mesos implementation).
>> 3) I also renamed all of the mesos evaluator packages to their spark
>> counterparts. Do you see any issues with reusing the evaluator parameters
>> classes?
>> 4) Finally, I am looking at the mesos util directory and wondering if
>> we can do without any of the remote management functionality (i.e.
>> MesosRemoteManager etc.)?
>> 
>> 
>> Would love some input on this as I piece through the first implementation
>> of the reef-runtime-spark.
>> Regards
>> 
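
The Design A flow that John describes in the quoted reply (decline offers
while nothing is requested, keep a suitable offer once a request arrives,
launch on the kept offer, and treat a rescinded offer as the exceptional case)
can be sketched as a small toy model. This is only an illustration under
stated assumptions: it is plain Python, it does not use the Mesos or REEF
APIs, and every name in it (Offer, DesignAScheduler, on_offer,
on_offer_rescinded, and so on) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A request is modeled as (cpus, mem_mb); this shape is an assumption.
Request = Tuple[float, int]


@dataclass(frozen=True)
class Offer:
    """Hypothetical stand-in for a Mesos resource offer."""
    offer_id: str
    cpus: float
    mem_mb: int


@dataclass
class DesignAScheduler:
    """Toy model of Design A; not the actual REEF or Mesos scheduler."""
    pending: List[Request] = field(default_factory=list)
    held: Dict[str, Tuple[Offer, Request]] = field(default_factory=dict)
    declined: List[str] = field(default_factory=list)
    launched: List[str] = field(default_factory=list)

    def request_resource(self, cpus: float, mem_mb: int) -> None:
        # Upon resource request: remember what we need.
        self.pending.append((cpus, mem_mb))

    def on_offer(self, offer: Offer) -> bool:
        # Keep the offer if it satisfies some pending request.
        for i, (cpus, mem_mb) in enumerate(self.pending):
            if offer.cpus >= cpus and offer.mem_mb >= mem_mb:
                request = self.pending.pop(i)
                self.held[offer.offer_id] = (offer, request)
                return True
        # Upon start (or no match): be a good citizen and decline.
        self.declined.append(offer.offer_id)
        return False

    def launch(self, offer_id: str) -> Offer:
        # Upon resource launch: start an evaluator directly on the kept offer.
        offer, _request = self.held.pop(offer_id)
        self.launched.append(offer_id)
        return offer

    def on_offer_rescinded(self, offer_id: str) -> None:
        # Exceptional case: a kept offer was taken back, so re-queue
        # the request that the offer had been matched to.
        entry = self.held.pop(offer_id, None)
        if entry is not None:
            self.pending.append(entry[1])
```

In this sketch no extra communication channel is involved: the scheduler
simply holds on to an offer between the request and the launch, and the
rescind handler covers the case where an allocator takes a held offer back.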
