hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-12167) HBase metastore causes massive number of ZK exceptions in MiniTez tests
Date Wed, 14 Oct 2015 01:48:05 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14956113#comment-14956113 ]

Sergey Shelukhin edited comment on HIVE-12167 at 10/14/15 1:47 AM:
-------------------------------------------------------------------

That's because config management for the HBase metastore is terrible and involves a static and
a threadlocal.
First, the test inits the static and one proper threadlocal.
Then some other random thread inits its own threadlocal with its own unrelated conf (for everyone)
and sets its threadlocal to an incorrect value.
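
To illustrate the hazard described above, here is a minimal sketch of the static-plus-threadlocal pattern. It is not the actual HBaseReadWrite code: the class name, the string "connections", and the quorum value test-cluster:12345 are hypothetical stand-ins chosen for the example. It only assumes that a per-thread instance is built lazily from whatever the shared static conf holds at that moment.

```java
import java.util.concurrent.atomic.AtomicReference;

public class StaticPlusThreadLocalDemo {
    // Hypothetical stand-in for a process-wide static configuration.
    static volatile String staticConf;

    // Each thread lazily builds its own "connection" from whatever
    // the static conf holds at the time of that thread's first get().
    static final ThreadLocal<String> perThread =
        ThreadLocal.withInitial(() -> "connection[" + staticConf + "]");

    static void setConf(String conf) {
        staticConf = conf;
    }

    static String getInstance() {
        return perThread.get();
    }

    // Returns {test thread's view, background thread's view, final static conf}.
    static String[] run() {
        // The test thread sets the static and creates one proper threadlocal.
        setConf("quorum=test-cluster:12345");
        String testView = getInstance();

        // Another thread arrives with an unrelated conf. It overwrites the
        // static for everyone and builds its own threadlocal from that value.
        AtomicReference<String> backgroundView = new AtomicReference<>();
        Thread background = new Thread(() -> {
            setConf("quorum=localhost:2181");
            backgroundView.set(getInstance());
        });
        background.start();
        try {
            background.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new String[] {testView, backgroundView.get(), staticConf};
    }

    public static void main(String[] args) {
        String[] result = run();
        System.out.println("test thread:       " + result[0]);
        System.out.println("background thread: " + result[1]);
        System.out.println("static conf now:   " + result[2]);
    }
}
```

In this sketch the test thread keeps the instance it built from the first conf, while the background thread and the shared static end up holding the unrelated one, so any thread that initializes later also picks up the wrong value.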


> HBase metastore causes massive number of ZK exceptions in MiniTez tests
> -----------------------------------------------------------------------
>
>                 Key: HIVE-12167
>                 URL: https://issues.apache.org/jira/browse/HIVE-12167
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>
> I ran a random test (vectorization_10) with the HBase metastore for an unrelated reason, and I see a large number of exceptions in hive.log:
> {noformat}
> $ grep -c "ConnectionLoss" hive.log
> 52
> $ grep -c "Connection refused" hive.log
> 1014
> {noformat}
> The count of these log lines has increased by ~33% since merging the llap branch, but it was already high before that (39/~700 for the same test). These lines are not present if I disable the HBase metastore.
> The exceptions are:
> {noformat}
> 2015-10-13T17:51:06,232 WARN  [Thread-359-SendThread(localhost:2181)]: zookeeper.ClientCnxn (ClientCnxn.java:run(1102)) - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_45]
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_45]
> 	at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
> 	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) [zookeeper-3.4.6.jar:3.4.6-1569965]
> {noformat}
> that is retried for some seconds and then
> {noformat}
> 2015-10-13T17:51:22,867 WARN  [Thread-359]: zookeeper.ZKUtil (ZKUtil.java:checkExists(544)) - hconnection-0x1da6ef180x0, quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode (/hbase/hbaseid)
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/hbaseid
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
> 	at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222) ~[hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:541) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:105) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:879) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635) [hbase-client-1.1.1.jar:1.1.1]
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_45]
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) [?:1.8.0_45]
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) [?:1.8.0_45]
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:422) [?:1.8.0_45]
> 	at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:420) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:329) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hive.metastore.hbase.VanillaHBaseConnection.connect(VanillaHBaseConnection.java:56) [hive-metastore-2.0.0-SNAPSHOT.jar:?]
> 	at org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:227) [hive-metastore-2.0.0-SNAPSHOT.jar:?]
> 	at org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.<init>(HBaseReadWrite.java:83) [hive-metastore-2.0.0-SNAPSHOT.jar:?]
> 	at org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite$1.initialValue(HBaseReadWrite.java:157) [hive-metastore-2.0.0-SNAPSHOT.jar:?]
> 	at org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite$1.initialValue(HBaseReadWrite.java:151) [hive-metastore-2.0.0-SNAPSHOT.jar:?]
> 	at java.lang.ThreadLocal.setInitialValue(ThreadLocal.java:180) [?:1.8.0_45]
> 	at java.lang.ThreadLocal.get(ThreadLocal.java:170) [?:1.8.0_45]
> 	at org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.getInstance(HBaseReadWrite.java:205) [hive-metastore-2.0.0-SNAPSHOT.jar:?]
> 	at org.apache.hadoop.hive.metastore.hbase.StatsCache$Invalidator.run(StatsCache.java:309) [hive-metastore-2.0.0-SNAPSHOT.jar:?]
> {noformat}
> or (note this one is after the connection was already created)
> {noformat}
> 2015-10-13T17:51:58,134 WARN  [Thread-359]: zookeeper.ZKUtil (ZKUtil.java:getData(753)) - hconnection-0x1da6ef180x0, quorum=localhost:2181, baseZNode=/hbase Unable to get data of znode /hbase/meta-region-server
> org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
> 	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
> 	at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155) ~[zookeeper-3.4.6.jar:3.4.6-1569965]
> 	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:360) ~[hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:745) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.getMetaRegionState(MetaTableLocator.java:482) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.getMetaRegionLocation(MetaTableLocator.java:168) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:600) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:580) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:559) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:61) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateMeta(ConnectionManager.java:1185) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1152) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:153) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:61) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.loadCache(ClientSmallReversedScanner.java:211) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.next(ClientSmallReversedScanner.java:185) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1249) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1155) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:153) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:61) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:811) [hbase-client-1.1.1.jar:1.1.1]
> 	at org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.scan(HBaseReadWrite.java:2046) [hive-metastore-2.0.0-SNAPSHOT.jar:?]
> 	at org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.scan(HBaseReadWrite.java:2027) [hive-metastore-2.0.0-SNAPSHOT.jar:?]
> 	at org.apache.hadoop.hive.metastore.hbase.HBaseReadWrite.invalidateAggregatedStats(HBaseReadWrite.java:1707) [hive-metastore-2.0.0-SNAPSHOT.jar:?]
> 	at org.apache.hadoop.hive.metastore.hbase.StatsCache$Invalidator.run(StatsCache.java:309) [hive-metastore-2.0.0-SNAPSHOT.jar:?]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
