drill-dev mailing list archives

From Paul Rogers <prog...@maprtech.com>
Subject Re: Drill on YARN
Date Wed, 23 Mar 2016 02:42:45 GMT
Hi Jacques,

I’m thinking of “semi-static” allocation at first. Spin up a cluster of Drill-bits,
after which the user can add or remove nodes while the cluster runs. (The add part is easy,
the remove part is a bit tricky since we don’t yet have a way to gracefully shut down a
Drill-bit.) Once the basics work, we can incrementally try out dynamic allocation. For example,
someone could whip up a script to look at load and use the proposed YARN client app to adjust
resources. Later, we can fold dynamic load management into the solution once we’re sure
what folks want.
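
As a rough illustration of the load-watching script idea, here is a minimal sketch of just the
sizing decision. Everything in it is an assumption made for illustration: the class and method
names, the 0.80 and 0.30 load thresholds, and the notion of a single average-load number. How
load is measured, and how the proposed YARN client app would be asked to resize, is left open.

```java
// Hypothetical sketch: decides how many Drill-bits a cluster should have next.
// None of these names or thresholds come from Drill or YARN.
final class ResizeSketch {
    static final double SCALE_UP_LOAD = 0.80;   // assumed threshold
    static final double SCALE_DOWN_LOAD = 0.30; // assumed threshold

    /**
     * Returns the desired Drill-bit count given the current count and an
     * average load in the range 0.0 to 1.0, clamped to [min, max].
     * Moves by at most one node per call to avoid thrashing.
     */
    static int desiredCount(int current, double avgLoad, int min, int max) {
        int target = current;
        if (avgLoad > SCALE_UP_LOAD) {
            target = current + 1;
        } else if (avgLoad < SCALE_DOWN_LOAD) {
            target = current - 1;
        }
        return Math.max(min, Math.min(max, target));
    }
}
```

A script would call desiredCount() periodically and, when the result differs from the current
count, ask the client app to add or remove a node. Stepping one node at a time keeps the
remove path rare, which matters while graceful shutdown is still missing.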

I did look at Slider, Twill, Kitten and REEF. Kitten is too basic. I had great hope for Slider.
But it turns out that Slider and Twill (formerly Weave) have each built an elaborate framework to isolate
us from YARN. The Slider framework (written in Python) seems harder to understand than YARN
itself. At least, one has to be an expert in YARN to understand what all that Python code
does. And, just looking at the class count in the Twill Javadoc was overwhelming. Slider and
Twill have to solve the general case. If we build our own Java solution, we only have to solve
the Drill case, which is likely much simpler. 

A bespoke solution would seem to offer some other advantages. It lets us do things like integrate
ZK monitoring so we can learn of zombie Drill-bits (those that haven’t exited but are no longer sending
heartbeat messages). We can also gather metrics and historical data about the cluster as a whole. We
can try out different cluster topologies. (Run Drill-bits on x of y nodes on a rack, say.)
And, we can eventually do the dynamic load management we discussed earlier.
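
To make the zombie check concrete, here is a minimal sketch of the comparison it implies. It
assumes that every healthy Drill-bit keeps a registration in ZooKeeper, and that the application
can list both the hosts YARN reports as running and the hosts currently registered in ZK. The
class and method names are hypothetical, and fetching the two lists (from YARN and from ZK) is
deliberately left out.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: a zombie is a Drill-bit that YARN still reports as
// running but that no longer appears in the ZooKeeper registry.
final class ZombieSketch {
    /**
     * Returns the hosts present in runningPerYarn but absent from
     * registeredInZk, in sorted order. Neither input set is modified.
     */
    static Set<String> findZombies(Set<String> runningPerYarn,
                                   Set<String> registeredInZk) {
        Set<String> zombies = new TreeSet<>(runningPerYarn);
        zombies.removeAll(registeredInZk);
        return zombies;
    }
}
```

A real check would probably also allow a grace period, since a Drill-bit that has just started
may be running under YARN for a short time before it registers.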

But first, I look forward to hearing what others have tried and what we’ve learned about
how people want to use Drill in a production YARN cluster.


- Paul 

> On Mar 22, 2016, at 5:45 PM, Jacques Nadeau <jacques@dremio.com> wrote:
> This is great news, welcome!
> What are you thinking with regard to static versus dynamic resource
> allocation? We have some conversations going regarding workload management
> but they are still early so it seems like starting with user-controlled
> allocation makes sense initially.
> Also, have you spent much time evaluating whether one of the existing YARN
> frameworks such as Slider would be useful? Does anyone on the list have any
> feedback on the relative merits of these technologies?
> Again, glad to see someone picking this up.
> Jacques
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
> On Tue, Mar 22, 2016 at 4:58 PM, Paul Rogers <progers@maprtech.com> wrote:
>> Hi All,
>> I’m a new member of the Drill Team here at MapR. We’d like to take a look
>> at running Drill on YARN for production customers. JIRA suggests some early
>> work may have been done (DRILL-142 <
>> https://issues.apache.org/jira/browse/DRILL-142>, DRILL-1170 <
>> https://issues.apache.org/jira/browse/DRILL-1170>, DRILL-3675 <
>> https://issues.apache.org/jira/browse/DRILL-3675>).
>> YARN is a complex beast and the Drill community is large and growing. So,
>> a good place to start is to ask if anyone has already done work on
>> integrating Drill with YARN (see DRILL-142)?  Or has thought about what
>> might be needed?
>> DRILL-1170 (YARN support for Drill) seems a good place to gather
>> requirements, designs and so on. I’ve posted a “starter set” of
>> requirements to spur discussion.
>> Thanks,
>> - Paul
