impala-dev mailing list archives

From David Knupp <dkn...@cloudera.com>
Subject Re: impala start error. please help me to solve the troubles. thanks!
Date Wed, 28 Sep 2016 16:01:37 GMT
Hello,

So, as you've noticed, this is an HBase startup issue; Impala's dev 
environment relies on HBase. The root cause appears to be this:

     Caused by: org.apache.hadoop.hbase.ZooKeeperConnectionException: 
master:600000x0, quorum=localhost:2181, baseZNode=/hbase Unexpected 
KeeperException creating base node

Have you researched solutions to this error on HBase forums?
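
Since the master log quoted below shows repeated "Connection refused" 
errors for localhost:2181, one quick sanity check after a reboot is 
whether anything accepts TCP connections on that port. Here is a minimal 
Python sketch. The helper name port_is_open is made up for illustration 
and is not part of Impala or HBase, and it only tests TCP reachability, 
not whether ZooKeeper itself is healthy:

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and resolution failures.
        return False


if __name__ == "__main__":
    # 2181 is the quorum port from the exception; the log shows attempts
    # against both the IPv4 and IPv6 loopback addresses.
    for host in ("127.0.0.1", "::1"):
        state = "open" if port_is_open(host, 2181) else "refused or unreachable"
        print(f"{host}:2181 -> {state}")
```

If both addresses come back refused while jps still lists HQuorumPeer, 
that would suggest looking at the ZooKeeper log rather than the master 
log.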

One other thing that may be worth noting is that you are running with 
some old code. I know this because, as of this commit in April:

https://github.com/cloudera/Impala/commit/6d8c075d9c018883c56ff59c9f07ba0bbfa69873

...we no longer call the script referenced in this log line:

     Error in 
/home/linxiaoyong/impala_new/rtap-on-impala/impala/testdata/bin/run-hbase.sh 
at line 87: ${CLUSTER_BIN}/wait-for-hbase-master.py

Is there any chance you can rebase against the latest version of Impala, 
and try again?

--David
> Linxiaoyong <mailto:linxiaoyong@huawei.com>
> September 27, 2016 at 5:44 PM
> Dear Guys:
>
> Recently we compiled Impala in our development environment, and when 
> we ran the compiled Impala, we hit the following problem.
>
> Problem: Impala runs successfully as long as we do not reboot the 
> machine. After a reboot, however, we cannot restart the Impala 
> processes. We tried many machines, and the problem occurs on every one.
>
> We have struggled with this for a long time, but it still does not 
> work. We are wondering whether you can help us solve the problem.
>
> The environment and error messages are as follows.
>
> Environment:
> OS: Distributor ID: CentOS
> Description: CentOS Linux release 7.2.1511 (Core)
> Release: 7.2.1511
> Codename: Core
> Kernel: Linux version 3.10.0-327.28.2.el7.x86_64
> Impala version: cdh5-trunk
>
>
> 1. We start Impala using: ${IMPALA_HOME}/testdata/bin/run-all.sh, and 
> get the following message.
> [root@localhost rtap-on-impala]# ${IMPALA_HOME}/testdata/bin/run-all.sh
> Killing running services...
> Starting all cluster services...
> --> Starting mini-DFS cluster
> Stopping kms
> Stopping llama
> Stopping yarn
> Stopping hdfs
> Starting hdfs (Web UI - http://localhost:5070)
> ....Namenode started
> Starting yarn (Web UI - http://localhost:8088)
> Starting llama (Web UI - http://localhost:1501)
> Starting kms (Web UI - http://localhost:16000)
> The cluster is running
> --> Starting HBase
> localhost: starting zookeeper, logging to 
> /home/linxiaoyong/impala_new/rtap-on-impala/impala/cluster_logs/hbase/hbase-root-zookeeper-localhost.localdomain.out
> starting master, logging to 
> /home/linxiaoyong/impala_new/rtap-on-impala/impala/cluster_logs/hbase/hbase-root-master-localhost.localdomain.out
> 16/09/28 17:15:52 INFO util.VersionInfo: HBase 1.2.0-cdh5.8.0-SNAPSHOT
> 16/09/28 17:15:52 INFO util.VersionInfo: Source code repository 
> file:///var/lib/jenkins/workspace/generic-binary-tarball-and-maven-deploy/CDH5-Packaging-HBase-2016-02-24_17-14-20/hbase-1.2.0-cdh5.8.0-SNAPSHOT 
> revision=Unknown
> 16/09/28 17:15:52 INFO util.VersionInfo: Compiled by jenkins on Wed 
> Feb 24 17:26:12 PST 2016
> 16/09/28 17:15:52 INFO util.VersionInfo: From source with checksum 
> 2c2f0626ababf9b47e88728c663df5c7
> Waiting for HBase Master
> ...........................Failure
> Hbase master did NOT write /hbase/rs in 30.4s
> Error in 
> /home/linxiaoyong/impala_new/rtap-on-impala/impala/testdata/bin/run-hbase.sh 
> at line 87: ${CLUSTER_BIN}/wait-for-hbase-master.py
> Error in 
> /home/linxiaoyong/impala_new/rtap-on-impala/impala/testdata/bin/run-all.sh 
> at line 48: tee ${IMPALA_TEST_CLUSTER_LOG_DIR}/run-hbase.log
>
>
>
>
> 2. We opened cluster_logs/hbase/hbase-root-master-localhost.localdomain.out 
> in Vim. The errors are as follows:
>
> 16/09/28 17:16:10 INFO zookeeper.ClientCnxn: Opening socket connection 
> to server localhost/127.0.0.1:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 16/09/28 17:16:10 WARN zookeeper.ClientCnxn: Session 0x0 for server 
> null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 16/09/28 17:16:11 INFO zookeeper.ClientCnxn: Opening socket connection 
> to server localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to 
> authenticate using SASL (unknown error)
> 16/09/28 17:16:11 WARN zookeeper.ClientCnxn: Session 0x0 for server 
> null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 16/09/28 17:16:11 INFO zookeeper.ClientCnxn: Opening socket connection 
> to server localhost/127.0.0.1:2181. Will not attempt to authenticate 
> using SASL (unknown error)
> 16/09/28 17:16:11 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper 
> create failed after 4 attempts
> 16/09/28 17:16:11 WARN zookeeper.ClientCnxn: Session 0x0 for server 
> null, unexpected error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> 16/09/28 17:16:11 ERROR master.HMasterCommandLine: Master exiting
> java.lang.RuntimeException: Failed construction of Master: class 
> org.apache.hadoop.hbase.master.HMaster.
> at 
> org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2428)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:232)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2438)
> Caused by: org.apache.hadoop.hbase.ZooKeeperConnectionException: 
> master:600000x0, quorum=localhost:2181, baseZNode=/hbase Unexpected 
> KeeperException creating base node
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:206)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:187)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:590)
> at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:375)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> at 
> org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2421)
> ... 5 more
>
>
>
>
>
>
> 3. I used “jps” to list the running processes:
>
> [root@localhost rtap-on-impala]# jps
> 26528 LlamaAMMain
> 25921 NodeManager
> 25186 DataNode
> 25890 NodeManager
> 29188 Jps
> 25221 DataNode
> 25864 NodeManager
> 25162 DataNode
> 26635 Bootstrap
> 14194 -- process information unavailable
> 25246 NameNode
> 25950 ResourceManager
> 27423 HQuorumPeer
>
>
>
>

-- 
David Knupp
Software Engineer
Cloudera
415-312-1049
