hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-20794) Use Zookeeper for metastore service discovery
Date Mon, 26 Nov 2018 09:43:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698682#comment-16698682 ]

Hive QA commented on HIVE-20794:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12949455/HIVE-20794.05

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15624 tests executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=195)
	[druidmini_test_ts.q,druidmini_expressions.q,druid_timestamptz2.q,druidmini_test_alter.q,druidkafkamini_csv.q]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] (batchId=171)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/15054/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15054/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15054/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12949455 - PreCommit-HIVE-Build

> Use Zookeeper for metastore service discovery
> ---------------------------------------------
>
>                 Key: HIVE-20794
>                 URL: https://issues.apache.org/jira/browse/HIVE-20794
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ashutosh Bapat
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-20794.01, HIVE-20794.02, HIVE-20794.03, HIVE-20794.03, HIVE-20794.04,
HIVE-20794.05
>
>
> Right now, multiple metastore services can be specified in the hive.metastore.uris configuration,
but that list is static and cannot be modified dynamically. Use ZooKeeper for dynamic service
discovery of the metastore.
> h3. Improve ZooKeeperHiveHelper class (suggestions for name welcome)
> The ZooKeeper-related code (for service discovery) accesses ZooKeeper parameters directly
from HiveConf. The class is changed so that it can be used for both HiveServer2 and the Metastore
server and works with both configurations. The following methods are moved from HiveServer2
into ZooKeeperHiveHelper:
>  # startZookeeperClient
>  # addServerInstanceToZooKeeper
>  # removeServerInstanceFromZooKeeper
> h3. HiveMetaStore conf changes
>  # THRIFT_URIS (hive.metastore.uris) can also be used to specify ZooKeeper quorum. When
THRIFT_SERVICE_DISCOVERY_MODE (hive.metastore.service.discovery.mode) is set to "zookeeper"
the URIs are used as ZooKeeper quorum. When it's set to be empty, the URIs are used to locate
the metastore directly.
>  # Here is the list of HiveServer2 parameters and their proposed metastore conf counterparts.
It looks odd that the Metastore-related configurations do not have their macros start with
METASTORE, but start with THRIFT. I have just followed the naming convention used for other parameters.
>  ** HIVE_SERVER2_ZOOKEEPER_NAMESPACE - THRIFT_ZOOKEEPER_NAMESPACE (hive.metastore.zookeeper.namespace)
>  ** HIVE_ZOOKEEPER_CLIENT_PORT - THRIFT_ZOOKEEPER_CLIENT_PORT (hive.metastore.zookeeper.client.port)
>  ** HIVE_ZOOKEEPER_CONNECTION_TIMEOUT - THRIFT_ZOOKEEPER_CONNECTION_TIMEOUT (hive.metastore.zookeeper.connection.timeout)
>  ** HIVE_ZOOKEEPER_CONNECTION_MAX_RETRIES - THRIFT_ZOOKEEPER_CONNECTION_MAX_RETRIES (hive.metastore.zookeeper.connection.max.retries)
>  ** HIVE_ZOOKEEPER_CONNECTION_BASESLEEPTIME - THRIFT_ZOOKEEPER_CONNECTION_BASESLEEPTIME
(hive.metastore.zookeeper.connection.basesleeptime)
>  # An additional configuration, THRIFT_BIND_HOST, specifies the host address to bind the
Metastore service to. Right now the Metastore binds to *, i.e. all addresses, so it does not
know which of those addresses it should add to ZooKeeper. THRIFT_BIND_HOST solves that
problem: when this configuration is specified, the metastore server binds to that address
and also adds it to ZooKeeper if the service discovery mode is zookeeper.
> The following Hive ZK configurations seem to be related to managing locks and seem irrelevant
for MS ZK.
>  # HIVE_ZOOKEEPER_SESSION_TIMEOUT
>  # HIVE_ZOOKEEPER_CLEAN_EXTRA_NODES
> Since there is no configuration to be published, HIVE_ZOOKEEPER_PUBLISH_CONFIGS does
not have a THRIFT counterpart.
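To make the two modes of hive.metastore.uris concrete, here is a minimal sketch of how the settings described above might be combined. It is an illustration only: java.util.Properties stands in for the real configuration object, the class and method names are hypothetical, the host names and the namespace value are made up, and the comma-separated host format used for the quorum is an assumption (the description does not specify it).

```java
import java.util.Properties;

// Sketch only: java.util.Properties stands in for the real Hive configuration object.
final class MetastoreDiscoveryConfSketch {

    // Property names as listed in the issue description.
    static final String URIS = "hive.metastore.uris";
    static final String DISCOVERY_MODE = "hive.metastore.service.discovery.mode";
    static final String ZK_NAMESPACE = "hive.metastore.zookeeper.namespace";
    static final String ZK_CLIENT_PORT = "hive.metastore.zookeeper.client.port";

    /** Empty discovery mode: the URIs locate the metastore servers directly. */
    static Properties directConf(String metastoreUris) {
        Properties p = new Properties();
        p.setProperty(URIS, metastoreUris);
        p.setProperty(DISCOVERY_MODE, "");
        return p;
    }

    /** "zookeeper" discovery mode: the same URIs property carries the ZooKeeper quorum. */
    static Properties zookeeperConf(String quorum, String namespace, int clientPort) {
        Properties p = new Properties();
        p.setProperty(URIS, quorum); // quorum format is an assumption
        p.setProperty(DISCOVERY_MODE, "zookeeper");
        p.setProperty(ZK_NAMESPACE, namespace);
        p.setProperty(ZK_CLIENT_PORT, Integer.toString(clientPort));
        return p;
    }

    /** True when the configuration asks for ZooKeeper-based discovery. */
    static boolean usesZooKeeper(Properties p) {
        return "zookeeper".equalsIgnoreCase(p.getProperty(DISCOVERY_MODE, "").trim());
    }

    public static void main(String[] args) {
        // Host names and namespace below are hypothetical example values.
        Properties direct = directConf("thrift://ms1.example.com:9083");
        Properties zk = zookeeperConf("zk1.example.com,zk2.example.com", "hivemetastore", 2181);
        System.out.println("direct uses ZooKeeper: " + usesZooKeeper(direct));
        System.out.println("zk uses ZooKeeper: " + usesZooKeeper(zk));
    }
}
```

The point of the sketch is that the same hive.metastore.uris key is interpreted differently depending on the discovery mode, so the mode has to be checked before the URIs are used.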
> h3. HiveMetaStore class changes
>  # startMetaStore should also register the instance with Zookeeper, when configured.
>  # When shutting a metastore server down it should deregister itself from Zookeeper,
when configured.
>  # These changes use the refactored code described above.
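The register-on-start and deregister-on-shutdown behaviour listed above could look roughly like the sketch below. It does not use the actual HiveMetaStore or ZooKeeperHiveHelper code: InstanceRegistry, InMemoryRegistry and Server are hypothetical stand-ins, and an in-memory set replaces ZooKeeper so the example runs on its own. The port value is only an example.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: none of these types exist in Hive; they illustrate the lifecycle.
final class MetastoreLifecycleSketch {

    /** Hypothetical stand-in for the ZooKeeper-backed registration code. */
    interface InstanceRegistry {
        void register(String hostPort);
        void deregister(String hostPort);
    }

    /** In-memory replacement for ZooKeeper so the sketch is self-contained. */
    static final class InMemoryRegistry implements InstanceRegistry {
        final Set<String> instances = ConcurrentHashMap.newKeySet();
        @Override public void register(String hostPort) { instances.add(hostPort); }
        @Override public void deregister(String hostPort) { instances.remove(hostPort); }
    }

    static final class Server {
        private final String discoveryMode;
        private final String bindHost;
        private final int port;
        private final InstanceRegistry registry;
        private boolean registered;

        Server(String discoveryMode, String bindHost, int port, InstanceRegistry registry) {
            this.discoveryMode = discoveryMode;
            this.bindHost = bindHost;
            this.port = port;
            this.registry = registry;
        }

        /** Starts the server and registers it only when discovery mode is "zookeeper". */
        void start() {
            // The Thrift server itself would be started here.
            if ("zookeeper".equalsIgnoreCase(discoveryMode)) {
                registry.register(bindHost + ":" + port);
                registered = true;
            }
        }

        /** Removes the registration made by start(), if there was one. */
        void stop() {
            if (registered) {
                registry.deregister(bindHost + ":" + port);
                registered = false;
            }
        }
    }

    public static void main(String[] args) {
        InMemoryRegistry registry = new InMemoryRegistry();
        Server server = new Server("zookeeper", "ms1.example.com", 9083, registry);
        server.start();
        System.out.println("after start: " + registry.instances);
        server.stop();
        System.out.println("after stop: " + registry.instances);
    }
}
```

Tracking whether registration happened keeps stop() safe to call for a server that runs without service discovery. Using the bind host as the advertised address mirrors the role THRIFT_BIND_HOST plays in the description.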
> h3. HiveMetaStoreClient class changes
> When the service discovery mode is zookeeper, we fetch the metastore URIs from the specified
ZooKeeper and treat them as if they were specified in THRIFT_URIS, i.e. we use the existing
mechanisms to choose a metastore server to connect to and establish a connection.
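The client-side resolution described above can be sketched as a single function. This is not the HiveMetaStoreClient code: the class name, resolveUris, and the zkLookup parameter are hypothetical, and zkLookup stands in for whatever call actually reads the registered metastore URIs from ZooKeeper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Sketch only: illustrates the two resolution paths, not the real client.
final class MetastoreUriResolutionSketch {

    /**
     * Returns the metastore URIs a client should try.
     * zkLookup is a hypothetical hook mapping a ZooKeeper quorum string
     * to the metastore URIs currently registered there.
     */
    static List<String> resolveUris(String discoveryMode, String uris,
                                    Function<String, List<String>> zkLookup) {
        if ("zookeeper".equalsIgnoreCase(discoveryMode)) {
            // The configured value is the quorum; the real URIs come from ZooKeeper.
            return new ArrayList<>(zkLookup.apply(uris));
        }
        // Empty mode: the configured value already lists the metastore URIs.
        return splitCsv(uris);
    }

    /** Splits a comma-separated list, trimming entries and dropping empty ones. */
    static List<String> splitCsv(String csv) {
        List<String> out = new ArrayList<>();
        for (String part : csv.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Host names are hypothetical example values.
        Function<String, List<String>> fakeZk =
            quorum -> Arrays.asList("thrift://ms1.example.com:9083", "thrift://ms2.example.com:9083");
        System.out.println(resolveUris("zookeeper", "zk1.example.com,zk2.example.com", fakeZk));
        System.out.println(resolveUris("", "thrift://ms3.example.com:9083", fakeZk));
    }
}
```

Either way the caller ends up with a plain list of metastore URIs, which is why the existing server selection and connection logic can stay unchanged.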



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
