reef-dev mailing list archives

From John Yang <johnya...@gmail.com>
Subject Re: Design Considerations on reef-1791
Date Wed, 14 Jun 2017 01:55:39 GMT
Hi Saikat,

Thanks for the clarification. I'm not familiar with the reef-runtime-spark
project, so I'm not sure I can answer your questions in that regard.
You're welcome to ask questions about developing the mesos runtime itself,
if that becomes your interest. :)

Thanks!
John


On Wed, Jun 14, 2017 at 10:42 AM, Saikat Kanjilal <sxk1969@gmail.com> wrote:

> :)))))), Hi John,
> I am not working on the mesos runtime itself, but rather using it as a
> template for building the reef runtime on spark. Please read my email
> carefully below :) and let me know your thoughts on extending parts of this
> runtime to the reef runtime spark architecture.
> Regards
>
> Sent from my iPad
>
> > On Jun 13, 2017, at 6:19 PM, John Yang <johnyangk@gmail.com> wrote:
> >
> > Hi Saikat,
> >
> >
> > Many thanks for working on the mesos runtime!
> > I can answer 4): Yes, we can do without the extra remote managers, but with
> > some caveats.
> >
> > By default, Mesos employs pessimistic concurrency control
> > <https://research.google.com/pubs/pub41684.html> in giving out resource
> > offers.
> > So from our (REEF) perspective, once we get a resource offer from Mesos, I
> > believe the offer is pretty much for us to keep without any other job
> > taking it away from us.
> > With this in mind, the mesos runtime can do the following, which doesn't
> > really require any extra RemoteManagers.
> >
> >   - Upon start: Be a good citizen and reject any incoming offers, since we
> >   don't need any resources yet
> >   - Upon resource request: Keep an appropriate offer
> >   - Upon resource launch: Simply launch a REEF evaluator with the offer
> >
> > Let's call this Design A
> >
> > However, the current mesos runtime implementation (let's call it Design B)
> > does not work like Design A.
> > The main reason is that Mesos allows custom allocators
> > <http://mesos.apache.org/documentation/latest/allocation-module/#writing-a-custom-allocator>
> > that make offers to multiple jobs simultaneously.
> > So to be safe, Design B launches a Mesos task upon resource request, and
> > the task sets up a RemoteManager channel through which the REEF evaluator
> > is launched.
> >
> > I must admit that had I known more about the pessimistic locking 3 years
> > ago when I wrote the mesos runtime, I would've thought about going with
> > Design A, which covers the common case much more nicely.
> > And then, I would've handled the behaviors of custom allocators as
> > exceptional cases through implementing the Scheduler#offerRescinded
> > callback, although I'm still not sure if it's straightforward to do so with
> > REEF.
> >
> > All in all, I believe the mesos runtime hasn't really been maintained since
> > it was first written, and has bits that need to be refactored.
> > For example, I see that we're still using Mesos 0.25.0, when 1.2.0
> > <http://mesos.apache.org/> has been released.
> >
> > Hope this helps.
> >
> >
> > Thanks,
> > John
> >
> >
> >> On Wed, Jun 14, 2017 at 8:17 AM, Saikat Kanjilal <sxk1969@gmail.com> wrote:
> >>
> >> @Markus/Sergiy,
> >> I've spent the past few days studying the implementation of
> >> reef-runtime-mesos and have some things I want to discuss. As I mentioned
> >> before, I created reef-runtime-spark as a clone of the mesos runtime as a
> >> first step. However, the more I look at the code and try to figure out how
> >> to merge
> >> https://github.com/apache/reef/tree/master/lang/scala/reef-examples-scala/src/main/scala/org/apache/reef/examples/hellospark
> >> into reef-runtime-spark, the more things come to mind that need
> >> further discussion:
> >>
> >> 1) The mesos runtime currently uses Google Protocol Buffers and the Mesos
> >> task API. I am assuming we don't need any of this for the spark runtime, or
> >> any of the interfaces with Avro. Is that assumption correct?
> >> 2) I see a lot of classes in the org.apache.reef.runtime.mesos.driver
> >> package associated with launching, releasing, and requesting resources. In
> >> the interim I renamed all of these to Spark versions and am assuming we can
> >> still reuse them. Do you see any issues with this? If we can reuse them,
> >> they will be available through the SparkDriverConfiguration, which extends
> >> ConfigurationModuleBuilder (again similar to the Mesos implementation).
> >> 3) I also renamed all of the mesos evaluator packages to their spark
> >> counterparts. Do you see any issues with reusing the evaluator parameters
> >> classes?
> >> 4) Finally, I am looking at the mesos util directory and am wondering if
> >> we can do without any of the remote management functionality (i.e.
> >> MesosRemoteManager etc.).
> >>
> >>
> >> Would love some input on this as I piece through the first implementation
> >> of the reef-runtime-spark.
> >> Regards
> >>
>
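
Editor's sketch (not part of the original thread): the offer lifecycle that John calls Design A in the quoted message, plus the rescinded-offer handling he mentions as the exceptional case, could look roughly like the toy model below. It is a plain-Python simulation and uses neither the Mesos nor the REEF API. Every name in it (Offer, Request, DesignAScheduler, on_offers, on_offer_rescinded) is hypothetical, and the "appropriate offer" test is assumed to be a simple CPU and memory fit check.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Offer:
    """A resource offer as the scheduler sees it (hypothetical shape)."""
    offer_id: str
    cpus: float
    mem_mb: int


@dataclass(frozen=True)
class Request:
    """An outstanding resource request (hypothetical shape)."""
    cpus: float
    mem_mb: int


@dataclass
class DesignAScheduler:
    """Toy model of Design A; no name here comes from Mesos or REEF."""
    pending: List[Request] = field(default_factory=list)
    held: Dict[str, Request] = field(default_factory=dict)
    declined: List[str] = field(default_factory=list)
    launched: List[str] = field(default_factory=list)

    def request_resource(self, cpus: float, mem_mb: int) -> None:
        # Upon resource request: remember what is needed.
        self.pending.append(Request(cpus, mem_mb))

    def on_offers(self, offers: List[Offer]) -> None:
        for offer in offers:
            req = self.pending[0] if self.pending else None
            fits = (req is not None
                    and offer.cpus >= req.cpus
                    and offer.mem_mb >= req.mem_mb)
            if fits:
                # Keep an appropriate offer for the oldest pending request.
                self.held[offer.offer_id] = self.pending.pop(0)
            else:
                # Upon start (or when nothing fits): decline the offer.
                self.declined.append(offer.offer_id)

    def launch(self) -> Optional[str]:
        # Upon resource launch: start an evaluator on the oldest held offer.
        if not self.held:
            return None
        offer_id = next(iter(self.held))
        del self.held[offer_id]
        self.launched.append(offer_id)
        return offer_id

    def on_offer_rescinded(self, offer_id: str) -> None:
        # Exceptional case: a held offer was taken back, so the request
        # goes to the front of the queue and waits for another offer.
        req = self.held.pop(offer_id, None)
        if req is not None:
            self.pending.insert(0, req)
```

In this sketch a rescinded offer simply re-queues its request. Whether REEF's resource-request path can tolerate that kind of delay is the open question raised in the quoted message.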
