hadoop-yarn-dev mailing list archives

From Alejandro Abdelnur <t...@cloudera.com>
Subject Re: Llama - Low Latency Application Master
Date Fri, 27 Sep 2013 19:03:09 GMT
Steve,

First of all, thanks for taking the time to dig into the docs/code.

Sandy has already followed up on and answered many of your comments; I'm adding to that below.

> ... unduly negative about the alternate "Model a Service as a Yarn Application"

It was not the intention to come across as negative about it. We considered
it, but at the moment it is not a viable option. Still, even if we could
go this route, there is an impedance mismatch we could not sort out in this model.
First, we cannot resize the resources of a container. Second, we cannot
leverage the cgroups cpu controller as a way to enforce utilization for
multiple queries running in the same container.

> .. but not actually running anything other than sleep in the NM process,
> but instead effectively block-booking capacity on that box for other work.
> Yes?

Correct. In our case, Impala attaches the thread running a query to the
cgroup of the container. This ensures enforcement and accounting.
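
To make the mechanism concrete, here is a rough sketch of what that attachment
amounts to. The real code lives in Impala (it is C++ and uses the native thread
id of the query execution thread); the class name and cgroup path below are
illustrative only and depend on how the NM's cgroups handler is configured:

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class CgroupAttachSketch {

      /**
       * Move a task (thread or process id) into the CPU cgroup YARN created
       * for a container, so the cpu shares set for the container now govern
       * that work as well.
       */
      public static void attachToContainerCgroup(String containerId, long taskId)
          throws IOException {
        // Illustrative path; the hierarchy root is whatever the NM is configured with.
        Path tasksFile = Paths.get("/sys/fs/cgroup/cpu/hadoop-yarn", containerId, "tasks");
        // Appending the id to the cgroup's tasks file is what moves the task under it.
        Files.write(tasksFile,
            (taskId + "\n").getBytes(StandardCharsets.UTF_8),
            StandardOpenOption.APPEND);
      }
    }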

> .. failure handling strategy going to be?

Llama uses unmanaged AMs. If the AMs go away and come back before the RM's
"didn't hear anything from you" timeout, we are good. The RM is not aware of
Llama failures. Llama must implement its own failover logic. We are
planning to use HDFS to keep Llama 'editlogs'.
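
For reference, a much simplified sketch of the unmanaged-AM flow (class,
queue and application names are placeholders, and the AMRM-token handling a
real unmanaged AM needs before registering is left out):

    import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
    import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.YarnClient;
    import org.apache.hadoop.yarn.client.api.YarnClientApplication;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class UnmanagedAmSketch {
      public static void main(String[] args) throws Exception {
        YarnConfiguration conf = new YarnConfiguration();

        // Submit an application whose AM runs outside any container (unmanaged).
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(conf);
        yarnClient.start();
        YarnClientApplication app = yarnClient.createApplication();
        ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
        ctx.setApplicationName("llama-style-unmanaged-am");  // placeholder name
        ctx.setUnmanagedAM(true);                            // AM lives in this process
        ctx.setQueue("default");                             // placeholder queue
        yarnClient.submitApplication(ctx);

        // Register this process as the AM and keep heartbeating; as long as
        // heartbeats resume before the RM's AM-liveness timeout, the
        // registration (and the containers behind it) survives.
        AMRMClient<AMRMClient.ContainerRequest> amClient = AMRMClient.createAMRMClient();
        amClient.init(conf);
        amClient.start();
        amClient.registerApplicationMaster("", 0, "");
        // ... allocate()/heartbeat loop and reservation bookkeeping would go here ...
        amClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
      }
    }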

> .. why didn't you make YarnRMLlamaAMConnector extend CompositeService

I guess we will convert things into it. At the moment the lifecycle we need
is very basic.
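
For what it's worth, hanging the connector off CompositeService would look
roughly like the sketch below (the child service is hypothetical, it is just
to show where the pieces would go):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.service.CompositeService;

    public class LlamaConnectorService extends CompositeService {

      public LlamaConnectorService() {
        super(LlamaConnectorService.class.getName());
      }

      @Override
      protected void serviceInit(Configuration conf) throws Exception {
        // Children added here get init/start/stop driven by the parent, in order.
        // addService(new ThriftServerService());  // hypothetical child service
        super.serviceInit(conf);
      }

      @Override
      protected void serviceStart() throws Exception {
        super.serviceStart();
        // Connector-specific startup (e.g. registering the unmanaged AM) goes here.
      }

      @Override
      protected void serviceStop() throws Exception {
        // Connector-specific teardown first, then let the children stop.
        super.serviceStop();
      }
    }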

> .. Thrift RPC. Why the choice of that?

Hadoop IPC is not available in C++. Impala is C++. Llama uses Thrift as its
public interface for Impala.
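
To show the shape of it, here is a minimal Java client sketch; only the
libthrift plumbing (TSocket/TBinaryProtocol) is standard, the generated stub
names are hypothetical, and the same .thrift IDL gives Impala the equivalent
C++ client:

    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    public class ThriftClientSketch {
      public static void main(String[] args) throws Exception {
        // Host/port are placeholders for wherever the Llama Thrift service listens.
        TTransport transport = new TSocket("llama-host", 15000);
        transport.open();
        TBinaryProtocol protocol = new TBinaryProtocol(transport);

        // Hypothetical generated stub; the real class comes from compiling the .thrift IDL.
        // LlamaAMService.Client client = new LlamaAMService.Client(protocol);
        // client.Reserve(...);   // the reservation call shape depends on the IDL

        transport.close();
      }
    }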

Cheers.


On Fri, Sep 27, 2013 at 10:28 AM, Steve Loughran <stevel@hortonworks.com> wrote:

> On 27 September 2013 16:32, Sandy Ryza <sandy.ryza@cloudera.com> wrote:
>
> > Thanks for taking a look Steve.  Some responses inline.
> >
> >
> > On Fri, Sep 27, 2013 at 3:30 AM, Steve Loughran <stevel@hortonworks.com> wrote:
> >
> > > On 27 September 2013 00:48, Alejandro Abdelnur <tucu@cloudera.com> wrote:
> > >
> > > > Earlier this week I've posted the following comment for tomorrow's
> > > > Yarn
> > >
> > > I think it's an interesting strategy, even if the summary doc is a bit
> > > unduly negative about the alternate "Model a Service as a Yarn Application"
> > > strategy, which, as you will be aware means that YARN can actually enforce
> > > cgroup throttling of a service -and the YARN-896 stuff matters for all
> > > long-lifespan apps in YARN, Llama included.
> > >
> > We think important work is going on with YARN-896, some of which will
> > definitely benefit Llama.  We just think that, because of their different
> > needs, the model used for frameworks like Storm and Hoya isn't a good fit
> > for frameworks like Impala.
> >
>
> In the list of needs for something to work with Hoya, everything in the
> MUST/MUST NOT categories pretty much applies to everything that YARN can
> work with:
> https://github.com/hortonworks/hoya/blob/master/src/site/md/app_needs.md
>
> dynamically installed apps that use the DFS for all storage, can get killed
> on a whim and use dynamic binding mechanisms to locate peers, rather than
> have everything predefined in config files
>
>
> >
> >
> > > A quick look at the code hints to me that what you are doing in the AM is
> > > asking YARN for resources on a machine, but not actually running anything
> > > other than sleep in the NM process, but instead effectively block-booking
> > > capacity on that box for other work. Yes?
> > >
> > That's right.
> >
> >
> > > If so it's a bit like a setup someone I know had, with MRv1 co-existing
> > > with another grid scheduler -when the other scheduler (which nominally
> > > owned the boxes) ran more code, the #of slots reported by the TT was
> > > reduced. It worked -ish, but without two-way comms it was limited. It
> > > sounds like the NM hookup is to let the existing "legacy" scheduler know
> > > that there's capacity it can currently use, with that scheduler asking YARN
> > > nicely for the resources, rather than just take them and let the TT sort
> > > it out for itself.
> > >
> > To us, the bit about asking the scheduler nicely for resources is a
> > pretty big difference.  Impala asks for resources in the same way as
> > MapReduce, allowing a central YARN scheduler to have the final say and the
> > user to think in terms of queues instead of frameworks.  Asking for
> > resources based on which replicas have capacity is just an optimization
> > motivated by Impala's need for strict locality.
> >
>
> I think once we add a "long-lived" bit to an App request, you could start
> to think about schedulers making different placement decisions knowing that
> the resource will be retained for a while. Examples: hold back a bit longer
> before downgrading locality, on the basis that if a service is there for
> some weeks, locality really matters.
>
>
>
> >
> >
> > > What's your failure handling strategy going to be? Without YARN-1041 when
> > > the AM rolls over, it loses all its containers. Is your NM plugin set to
> > > pick that up and tell Impala it can't have them any more?
> > >
> > Right.  As soon as the NM kills an Impala container the NM plugin passes
> > that on to Impala, which releases the relevant resources.
> >
> >
> I see. Without that AM failure would leak unknown resources, which would be
> a disaster.
>
> One more code question: Thrift RPC. Why the choice of that? I'm curious
> because there is a bias in the Hadoop stack to Hadoop IPC and now
> Hadoop+protobuf, but you've chosen a different path. Why? Strengths?
> Weaknesses?
>



-- 
Alejandro
