drill-dev mailing list archives

From John Omernik <j...@omernik.com>
Subject Re: Drill on YARN
Date Thu, 24 Mar 2016 00:35:01 GMT
Happy to help. I will stay involved on the YARN side too. My hope is that any
improvements made to Drill for the benefit of Drill on YARN can be
abstracted rather than being just a Drill-on-YARN feature, creating
hooks to do things (like draining nodes we wish to shut down, or scaling
memory and CPU usage up and down) that could benefit both resource
managers.
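
A minimal sketch of what a resource-manager-neutral drain hook could look
like, based only on the behavior described in this thread: running work
continues, new work is refused, and the node can be stopped once it is idle.
The class and method names below are hypothetical and are not part of Drill,
YARN, or Mesos/Marathon.

```python
import threading


class DrainableNode:
    """Hypothetical model of a node with a drain mode (not a Drill API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._draining = False
        self._running = set()  # ids of queries/fragments still executing

    def start_drain(self):
        # Any resource manager could call this hook before removing the node.
        with self._lock:
            self._draining = True

    def try_accept(self, work_id):
        # While draining, refuse new work; otherwise record it as running.
        with self._lock:
            if self._draining:
                return False
            self._running.add(work_id)
            return True

    def finish(self, work_id):
        # Work accepted before the drain is allowed to run to completion.
        with self._lock:
            self._running.discard(work_id)

    def can_shut_down(self):
        # Safe to stop only once draining has begun and nothing is running.
        with self._lock:
            return self._draining and not self._running
```

A caller would invoke start_drain(), poll can_shut_down(), and only then
stop the process, whichever resource manager is in charge.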

On Wednesday, March 23, 2016, Paul Rogers <progers@maprtech.com> wrote:

> Hi John,
>
> Thanks for the great info here and in the call a while back. We are
> interested in Mesos as well — some of our folks here argue that Mesos is a
> better fit for Drill. We’d love to hear your experience. Perhaps start a
> new “Drill on Mesos” thread so we can keep the two discussions separate.
>
> We’re looking at YARN first primarily because it seems to come up more
> often: everyone who uses MapReduce is familiar with YARN. Mesos seems a
> more advanced solution adopted by more experienced folks such as yourself.
> We’re hoping that the lessons we learn about managing Drill on YARN can
> transfer to Mesos as well in some later release. Your experience with Mesos
> might help us verify if that hunch is valid.
>
> Thanks,
>
> - Paul
>
>
> > On Mar 23, 2016, at 5:49 AM, John Omernik <john@omernik.com> wrote:
> >
> > Hey Paul and Jacques, great discussion here. Paul, I believe we met a week
> > or two ago on a call.
> >
> > I have been running Drill successfully and powerfully (multi-tenant, etc.)
> > using Apache Mesos and Marathon. While I didn't write a framework for Drill
> > in Mesos, Marathon does give some very nice capabilities in managing my
> > Drill bits. Some of the features talked about here would actually be
> > extremely helpful in my Mesos/Marathon work as well if they were built into
> > Drill; however, I don't want to muddy the waters by taking away from the
> > YARN discussion either. I can go into details on my Mesos setup here to
> > give an understanding of how I am approaching things, but like I said, I
> > will do so only on request so as not to clutter the conversation.
> >
> > My initial "ya, I want that" reaction is to a way to send a signal,
> > perhaps via a REST API, to a specific Drill bit to enter "Drain
> > Mode". In this mode, all currently running queries/fragments continue to
> > execute as expected, but the bit won't be the foreman for any new queries,
> > nor will it accept new fragments. This basically allows the graceful
> > shutdown Paul spoke of, and would be extremely helpful in shutting nodes
> > down with a minimum of user impact.
> >
> >
> >
> > On Tue, Mar 22, 2016 at 9:42 PM, Paul Rogers <progers@maprtech.com> wrote:
> >
> >> Hi Jacques,
> >>
> >> I’m thinking of “semi-static” allocation at first. Spin up a cluster of
> >> Drill-bits, after which the user can add or remove nodes while the cluster
> >> runs. (The add part is easy, the remove part is a bit tricky since we don’t
> >> yet have a way to gracefully shut down a Drill-bit.) Once we get the basics
> >> to work, we can incrementally try out dynamics. For example, someone could
> >> whip up a script to look at load and use the proposed YARN client app to
> >> adjust resources. Later, we can fold dynamic load management into the
> >> solution once we’re sure what folks want.
> >>
> >> I did look at Slider, Twill, Kitten and REEF. Kitten is too basic. I had
> >> great hope for Slider. But, it turns out that Slider and Weave have each
> >> built an elaborate framework to isolate us from YARN. The Slider framework
> >> (written in Python) seems harder to understand than YARN itself. At least,
> >> one has to be an expert in YARN to understand what all that Python code
> >> does. And, just looking at the class count in the Twill Javadoc was
> >> overwhelming. Slider and Twill have to solve the general case. If we build
> >> our own Java solution, we only have to solve the Drill case, which is
> >> likely much simpler.
> >>
> >> A bespoke solution would seem to offer some other advantages. It lets us
> >> do things like integrate ZK monitoring so we can learn of zombie drill bits
> >> (haven’t exited, but not sending heartbeat messages). We can also gather
> >> metrics and historical data about the cluster as a whole. We can try out
> >> different cluster topologies. (Run Drill-bits on x of y nodes on a rack,
> >> say.) And, we can eventually do the dynamic load management we discussed
> >> earlier.
> >>
> >> But first, I look forward to hearing what others have tried and what we’ve
> >> learned about how people want to use Drill in a production YARN cluster.
> >>
> >> Thanks,
> >>
> >> - Paul
> >>
> >>
> >>> On Mar 22, 2016, at 5:45 PM, Jacques Nadeau <jacques@dremio.com> wrote:
> >>>
> >>> This is great news, welcome!
> >>>
> >>> What are you thinking in regards to static versus dynamic resource
> >>> allocation? We have some conversations going regarding workload management
> >>> but they are still early so it seems like starting with user-controlled
> >>> allocation makes sense initially.
> >>>
> >>> Also, have you spent much time evaluating whether one of the existing YARN
> >>> frameworks such as Slider would be useful? Does anyone on the list have any
> >>> feedback on the relative merits of these technologies?
> >>>
> >>> Again, glad to see someone picking this up.
> >>>
> >>> Jacques
> >>>
> >>>
> >>> --
> >>> Jacques Nadeau
> >>> CTO and Co-Founder, Dremio
> >>>
> >>> On Tue, Mar 22, 2016 at 4:58 PM, Paul Rogers <progers@maprtech.com> wrote:
> >>>
> >>>> Hi All,
> >>>>
> >>>> I’m a new member of the Drill Team here at MapR. We’d like to take a look
> >>>> at running Drill on YARN for production customers. JIRA suggests some early
> >>>> work may have been done
> >>>> (DRILL-142 <https://issues.apache.org/jira/browse/DRILL-142>,
> >>>> DRILL-1170 <https://issues.apache.org/jira/browse/DRILL-1170>,
> >>>> DRILL-3675 <https://issues.apache.org/jira/browse/DRILL-3675>).
> >>>>
> >>>> YARN is a complex beast and the Drill community is large and growing. So,
> >>>> a good place to start is to ask if anyone has already done work on
> >>>> integrating Drill with YARN (see DRILL-142)? Or has thought about what
> >>>> might be needed?
> >>>>
> >>>> DRILL-1170 (YARN support for Drill) seems a good place to gather
> >>>> requirements, designs and so on. I’ve posted a “starter set” of
> >>>> requirements to spur discussion.
> >>>>
> >>>> Thanks,
> >>>>
> >>>> - Paul
> >>>>
> >>>>
> >>
> >>
>
>

-- 
Sent from my iThing
