impala-issues mailing list archives

From "Taras Bobrovytsky (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (IMPALA-4733) HBase/Zookeeper continues to be flaky when starting the minicluster on RHEL7
Date Tue, 18 Apr 2017 21:05:41 GMT

     [ https://issues.apache.org/jira/browse/IMPALA-4733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Taras Bobrovytsky resolved IMPALA-4733.
---------------------------------------
    Resolution: Fixed

The new HBase issue we are seeing is different from this one. Opened IMPALA-5223.

> HBase/Zookeeper continues to be flaky when starting the minicluster on RHEL7
> ----------------------------------------------------------------------------
>
>                 Key: IMPALA-4733
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4733
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>    Affects Versions: Impala 2.8.0
>            Reporter: David Knupp
>            Assignee: Lars Volker
>            Priority: Critical
>              Labels: broken-build
>             Fix For: Impala 2.9.0
>
>         Attachments: hbase_logs.tgz
>
>
> HBase startup continues to be problematic on RHEL7. (We recently beefed up our error checking with IMPALA-4684, but this didn't address the underlying flakiness.)
> An example from a recent failure shows both Zookeeper flakiness (note the connection loss/retry) and a failure to launch HBase:
> {noformat}
> 21:31:20 Connecting to Zookeeper host(s).
> 21:31:27 Success: <kazoo.client.KazooClient object at 0x13e2590>
> 21:31:27 Waiting for HBase node: /hbase/master
> 21:31:33 No handlers could be found for logger "kazoo.client"
> 21:31:33 Zookeeper connection loss: retrying connection (1 of 3 attempts)
> 21:31:34 Stopping Zookeeper client
> 21:31:39 Connecting to Zookeeper host(s).
> 21:31:46 Success: <kazoo.client.KazooClient object at 0x7f49109b3210>
> 21:31:46 Waiting for HBase node: /hbase/master
> 21:31:51 Waiting for HBase node: /hbase/master
> 21:31:55 Waiting for HBase node: /hbase/master
> 21:31:56 Waiting for HBase node: /hbase/master
> 21:32:04 Waiting for HBase node: /hbase/master
> 21:32:09 Waiting for HBase node: /hbase/master
> 21:32:28 Waiting for HBase node: /hbase/master
> 21:32:28 Waiting for HBase node: /hbase/master
> 21:32:28 Failed while checking for HBase node: /hbase/master
> 21:32:28 Waiting for HBase node: /hbase/rs
> 21:32:28 Success: /hbase/rs
> 21:32:28 Stopping Zookeeper client
> 21:32:28 Could not get one or more nodes. Exiting with errors: 1
> {noformat}
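>
> The node check above appears to be driven by a Python script that polls ZooKeeper via kazoo (visible in the KazooClient object and "kazoo.client" logger lines). The actual script isn't reproduced here; the following is only a minimal sketch, assuming ZooKeeper on localhost:2181, a hypothetical 60-second per-node timeout, and the three connection attempts seen in the log, to illustrate the wait/retry flow:
> {noformat}
> # Minimal sketch (not the actual Impala check script): wait for the HBase znodes
> # with kazoo, retrying the ZooKeeper connection on loss, as in the log above.
> import sys
> import time
> from kazoo.client import KazooClient
> from kazoo.exceptions import ConnectionLoss
>
> HBASE_NODES = ["/hbase/master", "/hbase/rs"]  # znodes the check waits for
> TIMEOUT_S = 60                                # hypothetical per-node timeout
> MAX_CONN_ATTEMPTS = 3                         # matches "(1 of 3 attempts)" above
>
> def wait_for_node(zk, path, timeout_s):
>     """Poll ZooKeeper until `path` exists or the timeout expires."""
>     deadline = time.time() + timeout_s
>     while time.time() < deadline:
>         if zk.exists(path):
>             print("Success: %s" % path)
>             return True
>         print("Waiting for HBase node: %s" % path)
>         time.sleep(5)
>     print("Failed while checking for HBase node: %s" % path)
>     return False
>
> def check_hbase_nodes(hosts="localhost:2181"):
>     errors = 0
>     for attempt in range(1, MAX_CONN_ATTEMPTS + 1):
>         zk = KazooClient(hosts=hosts)
>         print("Connecting to Zookeeper host(s).")
>         try:
>             zk.start()
>             print("Success: %s" % zk)
>             # Count the nodes that never showed up before the timeout.
>             errors = sum(1 for path in HBASE_NODES
>                          if not wait_for_node(zk, path, TIMEOUT_S))
>             break
>         except ConnectionLoss:
>             print("Zookeeper connection loss: retrying connection "
>                   "(%d of %d attempts)" % (attempt, MAX_CONN_ATTEMPTS))
>         finally:
>             print("Stopping Zookeeper client")
>             zk.stop()
>     return errors
>
> if __name__ == "__main__":
>     errors = check_hbase_nodes()
>     if errors:
>         print("Could not get one or more nodes. Exiting with errors: %d" % errors)
>     sys.exit(errors)
> {noformat}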
> When I look at the end of the output of the HBase master node, it ends here:
> {noformat}
> 17/01/04 21:32:17 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /127.0.0.1:51485, server: localhost/127.0.0.1:2181
> 17/01/04 21:32:23 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x1596d1c12ee0009, negotiated timeout = 90000
> 17/01/04 21:32:23 INFO client.ZooKeeperRegistry: ClusterId read in ZooKeeper is null
> 17/01/04 21:32:23 INFO master.ActiveMasterManager: Deleting ZNode for /hbase/backup-masters/localhost,60000,1483594276304 from backup master directory
> {noformat}
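>
> One way to confirm on a failing run whether the master ever got past this point and registered /hbase/master would be to list the children of /hbase directly; a small kazoo snippet (assuming ZooKeeper is still reachable on localhost:2181, as in the logs):
> {noformat}
> # Hypothetical diagnostic: dump the znodes under /hbase to see whether
> # /hbase/master was ever created on the failing run.
> from kazoo.client import KazooClient
>
> zk = KazooClient(hosts="localhost:2181")
> zk.start()
> try:
>     for child in sorted(zk.get_children("/hbase")):
>         print("/hbase/" + child)
> finally:
>     zk.stop()
> {noformat}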
> If we look at a successful HBase startup from another test run, the output looks like this:
> {noformat}
> 16/12/29 20:27:00 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /127.0.0.1:59932, server: localhost/127.0.0.1:2181
> 16/12/29 20:27:00 INFO zookeeper.ClientCnxn: Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x1594dfb07910002, negotiated timeout = 90000
> 16/12/29 20:27:00 INFO client.ZooKeeperRegistry: ClusterId read in ZooKeeper is null
> 16/12/29 20:27:01 INFO util.FSUtils: Created version file at hdfs://localhost:20500/hbase with version=8
> 16/12/29 20:27:02 INFO master.MasterFileSystem: BOOTSTRAP: creating hbase:meta region
> 16/12/29 20:27:02 INFO regionserver.HRegion: creating HRegion hbase:meta HTD == 'hbase:meta', {TABLE_ATTRIBUTES => {IS_META => 'true', coprocessor$1 => '|org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint|536870911|'}, {NAME => 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '10', TTL => 'FOREVER', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => '8192', IN_MEMORY => 'false', BLOCKCACHE => 'false'} RootDir = hdfs://localhost:20500/hbase Table name == hbase:meta
> 16/12/29 20:27:02 INFO hfile.CacheConfig: blockCache=LruBlockCache{blockCount=0, currentSize=6566880, freeSize=6396457504, maxSize=6403024384, heapSize=6566880, minSize=6082873344, minFactor=0.95, multiSize=3041436672, multiFactor=0.5, singleSize=1520718336, singleFactor=0.25}, cacheDataOnRead=false, cacheDataOnWrite=false, cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, prefetchOnOpen=false
> [etc...]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
