drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (DRILL-4543) Advertise Drill-bit ports, status, capabilities in ZooKeeper
Date Wed, 30 Mar 2016 18:03:25 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216522#comment-15216522 ]

Paul Rogers edited comment on DRILL-4543 at 3/30/16 6:02 PM:
-------------------------------------------------------------

(Revised to reflect the current implementation, as explained below by Jacques. Revised again to
reverse the list order.) Drill already provides four layers of config:

* -Dname=value system properties from the command line.
* Overrides config file (provided by user)
* Per-module defaults (from class path)
* Defaults config file.

Items higher in the list take precedence over items lower in the list.
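
As a rough illustration of that precedence rule, here is a minimal Python sketch. It is not
Drill's actual config code; the resolve() helper, the layer labels, and the sample keys and
values are assumptions made up for the example.

```python
# Layers listed from highest to lowest precedence, mirroring the list above.
# Keys and values are illustrative only, not Drill's shipped settings.
LAYERS = [
    ("system", {"drill.exec.http.port": "9047"}),
    ("overrides", {"drill.exec.http.port": "8048", "drill.exec.cluster-id": "prod"}),
    ("module-defaults", {"drill.exec.cluster-id": "drillbits1"}),
    ("defaults", {"drill.exec.http.port": "8047", "drill.exec.zk.connect": "localhost:2181"}),
]


def resolve(key, layers=LAYERS):
    """Return the value from the first (highest-precedence) layer that defines key."""
    for _name, values in layers:
        if key in values:
            return values[key]
    raise KeyError(key)
```

With these sample layers, a key set in several places resolves to the system-property value,
and a key set only in the defaults file falls through to that file.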

A suggestion would be to insert an additional step for selected options:

* System properties
* Environment variables <-- New
* Overrides

The challenge is that env vars cannot use the same syntax as used elsewhere. Perhaps we need
a map of (env var name: system prop name) values. Note that two JVM options are already configured
as env vars in drill-env.sh:

DRILL_MAX_DIRECT_MEMORY="8G"
DRILL_HEAP="4G"
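
One possible shape for such a map, sketched in Python. DRILL_HOST_PORT is the example name
used later in this comment; DRILL_HTTP_PORT, the property names on the right-hand side, and
the env_layer() helper are assumptions for illustration, not an agreed design.

```python
import os

# Hypothetical mapping of env var name -> config property name.
ENV_TO_PROP = {
    "DRILL_HOST_PORT": "drill.exec.rpc.user.server.port",
    "DRILL_HTTP_PORT": "drill.exec.http.port",
}


def env_layer(environ=None, mapping=ENV_TO_PROP):
    """Build a config layer from whichever mapped env vars are set and non-empty."""
    environ = os.environ if environ is None else environ
    return {prop: environ[var] for var, prop in mapping.items() if environ.get(var)}
```

The resulting dict could then sit between the system-property layer and the overrides file.
Unmapped env vars are ignored, so only the selected options can be changed this way.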

The -Dname=value system properties are handy, but they require assistance from the launch
script to 1) format the properties as system properties, and 2) put them into the right place
on the command line.

It may be easier for tools such as Mesos to set an environment variable and not alter the
command line.

The easiest solution is simply for the drillbit.sh script to special-case the few variables
(to Jacques' point below) that are needed: ports, etc. If the DRILL_HOST_PORT env var is set,
say, then add the -Dname=value equivalent to the command line.
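
The special-casing logic is small. The sketch below shows it in Python rather than in the
launch script itself; the property name paired with DRILL_HOST_PORT and the
system_property_args() helper are assumptions for illustration.

```python
import os

# Hypothetical pairs of (env var name, property name) that the launcher special-cases.
SPECIAL_CASES = (
    ("DRILL_HOST_PORT", "drill.exec.rpc.user.server.port"),
)


def system_property_args(environ=None, special_cases=SPECIAL_CASES):
    """Return -Dname=value arguments for each special-cased env var that is set."""
    environ = os.environ if environ is None else environ
    args = []
    for var, prop in special_cases:
        value = environ.get(var)
        if value:  # skip unset or empty variables
            args.append(f"-D{prop}={value}")
    return args
```

The returned list would be spliced into the JVM command line ahead of the main class, which
is the "right place on the command line" concern raised above.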

Another alternative is to allow multiple layers of overrides:

* Drill-bit overrides (new, per-Drill-bit file)
* Site Overrides (existing file)

In this case, the custom properties would be written to a per-Drill-bit file that the Drill-bit
reads. (Details omitted for now.)

All three proposals allow Mesos, YARN (or Ansible or other tools) to override the site-wide
config as needed.

Because we are altering values per-Drill-bit, the values that impact query planning must be
communicated to the other Drill bits. Thus the request that each Drill-bit use ZK to advertise
its actual values as computed using the override rules. Net result: YARN (or Mesos) can adjust
ports and resources, and other Drill-bits can learn of those customizations.
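
A toy model of that advertisement, in Python. A plain dict stands in for ZooKeeper and JSON
stands in for the Protobuf encoding; the field names, the sample numbers, and both helper
functions are assumptions for illustration only.

```python
import json

# Site-wide values every Drill-bit is assumed to share (illustrative numbers).
SITE_CONFIG = {"user_port": 31010, "http_port": 8047, "memory_gb": 8, "cpus": 4}


def advertise(registry, host, actual_values):
    """Publish a Drill-bit's actual values under its host name."""
    registry[host] = json.dumps(actual_values, sort_keys=True)


def effective_config(registry, host, site_config=SITE_CONFIG):
    """What a peer assumes about host: advertised values win, site config fills gaps."""
    advertised = json.loads(registry.get(host, "{}"))
    return {**site_config, **advertised}
```

A Drill-bit that advertises nothing is treated as having the site-wide values, which keeps
older Drill-bits usable alongside ones that publish per-node ports or memory.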

It would also be good, in the Web UI, to display the set of actual values. A bonus would be
to tag each value with its source (defaults, config file name, env, system) to aid admin troubleshooting.
(One can see the actual values today using

SELECT * FROM sys.boot;

But it doesn't show the origin of the value.)
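
Tagging each value with its source is cheap if the layers are resolved in order. A minimal
Python sketch, where the layer labels and both helper functions are hypothetical and not
part of Drill:

```python
def resolve_with_source(key, layers):
    """Return (value, source) from the highest-precedence layer that defines key."""
    for source, values in layers:
        if key in values:
            return values[key], source
    raise KeyError(key)


def report(layers):
    """Return {key: (value, source)} for every key defined in any layer."""
    keys = sorted({key for _source, values in layers for key in values})
    return {key: resolve_with_source(key, layers) for key in keys}
```

Here layers is a list of (source label, dict) pairs ordered from highest to lowest
precedence, for example system, env, an overrides file name, then defaults.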

(This JIRA covers only the ZK advertisement; the other details above are provided as background.
We'll move the config and reporting items to another JIRA.)


was (Author: paul-rogers):
(Revised to reflect the current implementation, as explained below by Jacques.) Drill already provides
four layers of config:

* Defaults config file.
* Per-module defaults (from class path)
* Overrides config file (provided by user)
* -Dname=value system properties from the command line.

Items higher in order take precedence over items lower in the order.

A suggestion would be to insert an additional step:

* Overrides
* Environment variables <-- New
* System properties

The challenge is that env vars cannot use the same syntax as used elsewhere. Perhaps we need
a map of (env var name: system prop name) values.

The -Dname=value system properties are handy, but they require assistance from the launch
script to 1) format the properties as system properties, and 2) put them into the right place
on the command line.

It may be easier for tools such as Mesos to set an environment variable and not alter the
command line.

Another alternative is to allow multiple layers of overrides:

* Site Overrides (existing file)
* Drill-bit overrides (new, per-Drill-bit file)

In this case, the custom properties would be written to a per-Drill-bit file that the Drill-bit
reads. (Details omitted for now.)

All three proposals allow Mesos, YARN (or Ansible or other tools) to override the site-wide
config as needed.

Because we are altering values per-Drill-bit, the values that impact query planning must be
communicated to the other Drill bits. Thus the request that each Drill-bit use ZK to advertise
its actual values as computed using the override rules. Net result: YARN (or Mesos) can adjust
ports and resources, and other Drill-bits can learn of those customizations.

It would also be good, in the Web UI, to display the set of actual values. A bonus would be
to tag each value with its source (defaults, config file name, env, system) to aid admin troubleshooting.
(One can see the actual values today using

SELECT * FROM sys.boot;

But it doesn't show the origin of the value.)

(This JIRA covers only the ZK advertisement; the other details above are provided as background.
We'll move the config and reporting items to another JIRA.)

> Advertise Drill-bit ports, status, capabilities in ZooKeeper
> ------------------------------------------------------------
>
>                 Key: DRILL-4543
>                 URL: https://issues.apache.org/jira/browse/DRILL-4543
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components:  Server
>            Reporter: Paul Rogers
>             Fix For: 2.0.0
>
>
> Today Drill uses ZooKeeper (ZK) to advertise the existence of a Drill-bit, providing
the host name/IP Address of the Drill-bit and the ports used, encoded in Protobuf format.
All other information (status, CPUs, memory) is assumed to be the same across all Drill-bits
in the cluster as specified in the Drill config file. (Amended to reflect 1.6 behavior.)
> Moving forward, as Drill becomes more sophisticated, Drill should advertise the specifics
of each Drill-bit so that one Drill bit can differ from another.
> For example, when running on YARN, we need a way for Drill to gracefully shut down. Advertising
a status of Ready or Unavailable will help. Ready is the normal state. Unavailable means the
Drill-bit will finish in-flight queries, but won't accept new ones. (The actual status is
a separate enhancement.)
> In a YARN cluster, Drill should take advantage of machines with more memory, but live
with machines with less. (Perhaps some are newer, some are older or more heavily loaded.)
Drill should use ZK to identify its available memory and CPUs so that the planner can use
them. (Use of the info is a separate enhancement.)
> There may be times when two drill bits run on a single machine. If so, they must use
separate ports. So, each Drill-bit should advertise its ports in ZK.
> For backward compatibility, the information is optional; if not present, the receiver
should assume the information defaults to that in the config file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
