drill-dev mailing list archives

From paul-rogers <...@git.apache.org>
Subject [GitHub] drill issue #922: DRILL-5741: During startup Drill should not exceed the ava...
Date Wed, 30 Aug 2017 17:33:30 GMT
Github user paul-rogers commented on the issue:

    This may be one of those times when we need to resort to a bit of design thinking.
    The core idea is that the user sets one environment variable to check the others. The
first issue is that, if the user can't do the sums to set the Drill memory allocation right
(with respect to actual memory), not sure how they will get the total memory variable right.
    OK, so we get the memory from the system, then do a percentage. That is better. But, what
is the system memory? Is it total memory? Suppose the user says Drill gets 60%. We can now
check. But, Drill is distributed. Newer nodes in a cluster may have 256GB, older nodes 128GB.
Drill demands symmetrical resources so the memory given to Drill must be identical on all
nodes, regardless of system memory. So, the percent of total system memory idea doesn't work
in practice.
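
    To make the asymmetry concrete, here is a minimal sketch using the two node sizes
mentioned above. The helper name `pct_of_gb` is made up for illustration and is not part
of any Drill script; it uses integer arithmetic and rounds down.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compute a percentage of a memory size given in GB.
# Integer arithmetic only, so the result is rounded down.
pct_of_gb() {
  local total_gb=$1 percent=$2
  echo $(( total_gb * percent / 100 ))
}

# The same 60% setting applied to the two node sizes from the example.
new_node=$(pct_of_gb 256 60)
old_node=$(pct_of_gb 128 60)

# Drill wants the same allocation everywhere, so differing results are a problem.
if [ "$new_node" -ne "$old_node" ]; then
  echo "asymmetric: ${new_node}GB vs ${old_node}GB"
fi
```

    The same percentage yields a different allocation on each node size, which is exactly
what a symmetrical cluster cannot accept.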
    So, maybe we express memory as the total *free* memory. Cool. We give Drill 60%. Drill
starts and everything is fine. Now, we also give Spark 60%. Spark starts. It complains in
its logs (assuming we make this same change to the Spark startup scripts). But, Spark uses
its memory and causes Drill to fail. We check Drill logs. Nada. We have to check Spark's logs.
Now, imagine doing this with five apps; the app that complains may not be the one to fail.
And, imagine doing this across 100 nodes. Won't scale.
    Note that the problem is that we checked memory statically at startup. But, our problem
was that things changed later: we launched an over-subscribed Spark. So, our script must run
continuously, constantly checking if any new apps are launched. Since some apps grow memory
over time, we have to check all other apps for total memory usage against that allocated to
    Now, presumably, all other apps are doing the same: Spark is continually checking, Storm
is doing so, and so on. Now, the admin needs to gather all these logs (across dozens of nodes)
and extract meaning. What we need, then, is a network endpoint to publish the information
and a tool to gather and report that data. We've just invented monitoring tools.
    Take a step back: what we really want to know is available system memory vs. that consumed
by apps. So, what we want is a Linux-level monitoring of free memory. And, since we have other
things to do, we want alerts when free memory drops below some point. We've now invented alerting
    Now, we got into this mess because we launched apps without concern about the total memory
usage on each node. That is, we didn't plan our app load to fit into our available memory.
So, we turn this around. We've got 128GB (say) of memory. How do we run only those apps that
fit, deferring those that don't? We've just invented YARN, Mesos, Kubernetes and the like.
    Now we get to the reason for the -1. The proposed change adds significant complexity to
the scripts, *but can never solve the actual oversubscription problem*. For that, we need
a global resource manager.
    Now, suppose that someone wants to run Drill without such a manager. Perhaps some distribution
does not provide this tool and instead provides a tool that simply launches processes, leaving
it to each process to struggle with its own resources. In such an environment, the vendor
can add a check, such as this one, that will fire on all nodes and warn the user about potential
oversubscription *on that node*, *at that moment*, *for that app* in *one app's log file*.
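
    A rough sketch of what such a vendor-side check might look like follows. All function
names are hypothetical. It assumes Linux (it reads `MemAvailable` from `/proc/meminfo`)
and assumes memory settings are written with a `G` or `M` suffix, such as `8G`. It only
warns; it inspects one node at one moment and so has all the limits described above.

```shell
#!/usr/bin/env bash
# Sketch of a per-node startup check. Function names are hypothetical.

# Convert a size such as "8G" or "512M" to KB. Only the G and M suffixes
# are handled; anything else is passed through as a plain KB count.
to_kb() {
  local v=$1
  case "$v" in
    *G) echo $(( ${v%G} * 1024 * 1024 )) ;;
    *M) echo $(( ${v%M} * 1024 )) ;;
    *)  echo "$v" ;;
  esac
}

# Linux-specific: read MemAvailable (reported in KB) from /proc/meminfo.
mem_available_kb() {
  awk '/^MemAvailable:/ { print $2 }' /proc/meminfo
}

# Warn on stderr and return 1 if the request exceeds what is available.
check_fits() {
  local requested_kb=$1 available_kb=$2
  if [ "$requested_kb" -gt "$available_kb" ]; then
    echo "WARNING: requested ${requested_kb}KB exceeds available ${available_kb}KB" >&2
    return 1
  fi
  return 0
}

# Possible use, assuming heap and direct memory are set in variables
# such as DRILL_HEAP and DRILL_MAX_DIRECT_MEMORY:
#   requested=$(( $(to_kb "$DRILL_HEAP") + $(to_kb "$DRILL_MAX_DIRECT_MEMORY") ))
#   check_fits "$requested" "$(mem_available_kb)"
```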
    To facilitate this, we can do three things.
    1. In the vendor-specific `distrib-env.sh` file, do any memory setting adjustments that
are wanted.
    2. Modify `drillbit.sh` to call a `drill-check.sh` script, if it exists, just prior to
launching Drill.
    3. In the vendor-specific `drill-check.sh` file, do the check proposed here.
    The only change needed in Apache Drill is step 2. Then each vendor can add the checks
if they don't provide a resource manager. Those vendors (or users) that use YARN or Mesos
or whatever don't need the checks because they have overall tools that solve the problem
for them.
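
    The hook in step 2 could be as small as the sketch below. This is not the actual
`drillbit.sh` code: the function name `run_optional_check` is invented for illustration,
and treating a failed check as a warning rather than aborting the launch is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical pre-launch hook: run an optional check script if one exists.
run_optional_check() {
  local script=$1
  # No script provided by the vendor: nothing to do.
  [ -f "$script" ] || return 0
  # Run the check; a failure is reported but does not block startup.
  if ! bash "$script"; then
    echo "WARNING: $script reported a problem" >&2
  fi
  return 0
}

# Possible call site just before launching Drill, assuming the script
# lives in the configuration directory:
#   run_optional_check "$DRILL_CONF_DIR/drill-check.sh"
```

    Because the script is optional, installations that run under a resource manager simply
omit it and see no change in behavior.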
