hbase-user mailing list archives

From "Marchwiak, Patrick D." <marchwi...@llnl.gov>
Subject Re: Unable to perform list/create after startup
Date Sat, 14 Aug 2010 00:22:49 GMT
I've attached the log.

One more thing I'll add is that the stop-hbase.sh script hangs on the
"stopping master..." line, so I had to manually kill the HMaster process
before doing a restart.

On 8/13/10 5:00 PM, "Jean-Daniel Cryans" <jdcryans@apache.org> wrote:

> A clean log of a full master startup would be really useful, can't
> tell much more by the current info you provided.
> 
> J-D
> 
> On Fri, Aug 13, 2010 at 4:50 PM, Marchwiak, Patrick D.
> <marchwiak1@llnl.gov> wrote:
>> I am having issues performing any operations (list/create/put) on my hbase
>> instance once it starts up.
>> 
>> The environment:
>> Red Hat 5.5
>> Hadoop 0.20.2
>> HBase 0.20.4
>> java 1.6.0_20
>> 1 running master
>> 23 running regionservers + 3 also running ZooKeeper
>> 
>> When attempting to do a list from the hbase shell, it returns this error:
>> NativeException: org.apache.hadoop.hbase.MasterNotRunningException: null
>> 
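For what it's worth, the shell's list goes through HBaseAdmin, so a bare-bones
standalone check along these lines (just a sketch against the stock 0.20 client
API, not something from our own code) should exercise the same master lookup
outside the shell:

// Minimal sketch: verify the master is reachable the same way the shell does.
// Assumes an hbase-site.xml with the ZooKeeper quorum is on the classpath.
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.MasterNotRunningException;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CheckMaster {
  public static void main(String[] args) throws Exception {
    HBaseConfiguration conf = new HBaseConfiguration();
    try {
      // Throws MasterNotRunningException if the master can't be reached
      // at the address registered in ZooKeeper.
      HBaseAdmin.checkHBaseAvailable(conf);
      System.out.println("master reachable");
    } catch (MasterNotRunningException e) {
      System.out.println("master not running: " + e.getMessage());
    }
  }
}
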
>> When attempting to perform inserts from a hadoop job I see the following
>> error in my application:
>> 
>> 2010-08-13 14:03:22.207 INFO  [main] JobClient:1317 Task Id : attempt_201006091333_0031_m_000000_0, Status : FAILED
>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:930)
>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:581)
>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:563)
>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:694)
>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:590)
>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:563)
>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:694)
>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:594)
>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:557)
>>        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:127)
>> ...
>> 
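The map tasks open the table in the usual way; below is a stripped-down sketch
of that path (table, column, and quorum names are placeholders, not our actual
job config). The HTable constructor is the HTable.<init> frame at the bottom of
the trace, which is where the root region lookup that times out happens:

// Sketch of the client path the failing tasks go through (placeholder names).
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PutExample {
  public static void main(String[] args) throws Exception {
    HBaseConfiguration conf = new HBaseConfiguration();
    // Example quorum setting; a wrong or stale value here produces the same
    // "Timed out trying to locate root region".
    conf.set("hbase.zookeeper.quorum", "zkhost1,zkhost2,zkhost3");
    // HTable.<init> asks ZooKeeper for -ROOT-, then .META., then the region.
    HTable table = new HTable(conf, "mytable");
    Put put = new Put(Bytes.toBytes("row1"));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value"));
    table.put(put);
  }
}
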
>> Now contrary to what the shell is reporting, the HMaster process is
>> definitely running (along with HRegionServer and HQuorumPeer on the
>> appropriate other nodes in the cluster). I do not see any errors in the
>> master log, though interestingly I noticed a log message mentioning only 7
>> region servers - in fact there are more than twice that many in the cluster.
>> 
>> 2010-08-13 14:04:32,018 INFO org.apache.hadoop.hbase.master.ServerManager: 7 region servers, 0 dead, average load 3.142857142857143
>> 
>> The last clue I have is some exceptions in the zookeeper logs:
>> 
>> 2010-08-13 13:34:16,041 WARN org.apache.zookeeper.server.PrepRequestProcessor: Got exception when processing sessionid:0x12a6d2847e40000 type:create cxid:0x28 zxid:0xfffffffffffffffe txntype:unknown n/a
>> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists
>>        at org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:245)
>>        at org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:114)
>> 2010-08-13 14:05:08,782 INFO org.apache.zookeeper.server.NIOServerCnxn: Connected to /128.115.210.161:35883 lastZxid 0
>> 2010-08-13 14:05:08,782 INFO org.apache.zookeeper.server.NIOServerCnxn: Creating new session 0x12a6d2847e40001
>> 2010-08-13 14:05:08,800 INFO org.apache.zookeeper.server.NIOServerCnxn: Finished init of 0x12a6d2847e40001 valid:true
>> 2010-08-13 14:05:08,802 WARN org.apache.zookeeper.server.PrepRequestProcessor: Got exception when processing sessionid:0x12a6d2847e40001 type:create cxid:0x1 zxid:0xfffffffffffffffe txntype:unknown n/a
>> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists
>>        at org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:245)
>>        at org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:114)
>> 2010-08-13 14:05:09,762 WARN org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of session 0x12a6d2847e40001 due to java.io.IOException: Read error
>> 2010-08-13 14:05:09,763 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x12a6d2847e40001 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/128.115.210.149:2181 remote=/128.115.210.161:35883]
>> 
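In case it's useful, the znodes ZooKeeper says already exist can be listed
directly. Here is a quick sketch with the plain ZooKeeper Java client, assuming
the default /hbase parent znode (the hostname is a placeholder); a robust tool
would also wait for the connected event before issuing reads:

// Sketch: dump the children of /hbase to see which znodes are already there
// (e.g. a stale master or root-region-server entry). Hostname is a placeholder.
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ListHBaseZNodes {
  public static void main(String[] args) throws Exception {
    ZooKeeper zk = new ZooKeeper("zkhost1:2181", 30000, new Watcher() {
      public void process(WatchedEvent event) { /* no-op for a one-off read */ }
    });
    for (String child : zk.getChildren("/hbase", false)) {
      byte[] data = zk.getData("/hbase/" + child, false, null);
      System.out.println(child + " = " + (data == null ? "" : new String(data)));
    }
    zk.close();
  }
}
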
>> HBase was running on this cluster a few months ago, so I doubt a blatant
>> misconfiguration is at fault. I've tried restarting everything HBase- or
>> Hadoop-related, as well as wiping out the HBase data directory on HDFS to
>> start fresh, with no result. Any hints or suggestions as to what the problem
>> might be are greatly appreciated. Thanks!