hbase-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Unable to perform list/create after startup
Date Wed, 18 Aug 2010 23:24:53 GMT
My question was answered by J-D in another thread.

Regards

On Wed, Aug 18, 2010 at 3:12 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> We use HBASE 0.20.6 with HBASE-2473
> I think we may have hit HBASE-2599
>
> I am looking at 2599-0.20.txt
> <https://issues.apache.org/jira/secure/attachment/12445536/2599-0.20.txt>,
> which you attached to the JIRA.
>
> I cannot find how to apply this change for HRegionServer.java:
>
> -                    serverInfo.setStartCode(System.currentTimeMillis());
> +                    this.serverInfo =
> +                      createServerInfoWithNewStartCode(this.serverInfo);
>
> I only found one call of the following form at line 776 in protected void
> init(final MapWritable c):
> this.hlogFlusher.setHLog(hlog);
>
> If someone can help me apply the patch, that would be great.
>
>
> On Fri, Aug 13, 2010 at 5:36 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>
>> Ah very helpful, see how .META. is getting reassigned even if it has a
>> valid assignment? Some environments get this for some reason, and this
>> is fixed by https://issues.apache.org/jira/browse/HBASE-2599 which you
>> will need to apply on your hbase.
>>
>> J-D
>>
>> On Fri, Aug 13, 2010 at 5:22 PM, Marchwiak, Patrick D.
>> <marchwiak1@llnl.gov> wrote:
>> > I've attached the log.
>> >
>> > One more thing I'll add is that the stop-hbase.sh script hangs on
>> > the "stopping master..." line, so I had to manually kill the HMaster
>> > process before doing a restart.
>> >
>> > On 8/13/10 5:00 PM, "Jean-Daniel Cryans" <jdcryans@apache.org> wrote:
>> >
>> >> A clean log of a full master startup would be really useful, can't
>> >> tell much more by the current info you provided.
>> >>
>> >> J-D
>> >>
>> >> On Fri, Aug 13, 2010 at 4:50 PM, Marchwiak, Patrick D.
>> >> <marchwiak1@llnl.gov> wrote:
>> >>> I am having issues performing any operations (list/create/put) on my
>> >>> hbase instance once it starts up.
>> >>>
>> >>> The environment:
>> >>> Red Hat 5.5
>> >>> Hadoop 0.20.2
>> >>> HBase 0.20.4
>> >>> java 1.6.0_20
>> >>> 1 running master
>> >>> 23 running regionserver + 3 also running zookeeper
>> >>>
>> >>> When attempting to do a list from the hbase shell it returns this error:
>> >>> NativeException: org.apache.hadoop.hbase.MasterNotRunningException: null
>> >>>
>> >>> When attempting to perform inserts from a hadoop job I see the following
>> >>> error in my application:
>> >>>
>> >>> 2010-08-13 14:03:22.207 INFO  [main] JobClient:1317 Task Id : attempt_201006091333_0031_m_000000_0, Status : FAILED
>> >>> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>> >>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRootRegion(HConnectionManager.java:930)
>> >>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:581)
>> >>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:563)
>> >>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:694)
>> >>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:590)
>> >>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:563)
>> >>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:694)
>> >>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:594)
>> >>>        at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:557)
>> >>>        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:127)
>> >>> ...
>> >>>
>> >>> Now contrary to what the shell is reporting, the HMaster process is
>> >>> definitely running (along with HRegionServer and HQuorumPeer on the
>> >>> appropriate other nodes in the cluster). I do not see any errors in the
>> >>> master log, though interestingly I noticed a log message mentioning only 7
>> >>> region servers - in fact there are more than twice that many in the cluster.
>> >>>
>> >>> 2010-08-13 14:04:32,018 INFO org.apache.hadoop.hbase.master.ServerManager: 7 region servers, 0 dead, average load 3.142857142857143
>> >>>
>> >>> The last clue I have is some exceptions in the zookeeper logs:
>> >>>
>> >>> 2010-08-13 13:34:16,041 WARN org.apache.zookeeper.server.PrepRequestProcessor: Got exception when processing sessionid:0x12a6d2847e40000 type:create cxid:0x28 zxid:0xfffffffffffffffe txntype:unknown n/a
>> >>> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists
>> >>>        at org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:245)
>> >>>        at org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:114)
>> >>> 2010-08-13 14:05:08,782 INFO org.apache.zookeeper.server.NIOServerCnxn: Connected to /128.115.210.161:35883 lastZxid 0
>> >>> 2010-08-13 14:05:08,782 INFO org.apache.zookeeper.server.NIOServerCnxn: Creating new session 0x12a6d2847e40001
>> >>> 2010-08-13 14:05:08,800 INFO org.apache.zookeeper.server.NIOServerCnxn: Finished init of 0x12a6d2847e40001 valid:true
>> >>> 2010-08-13 14:05:08,802 WARN org.apache.zookeeper.server.PrepRequestProcessor: Got exception when processing sessionid:0x12a6d2847e40001 type:create cxid:0x1 zxid:0xfffffffffffffffe txntype:unknown n/a
>> >>> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists
>> >>>        at org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:245)
>> >>>        at org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:114)
>> >>> 2010-08-13 14:05:09,762 WARN org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of session 0x12a6d2847e40001 due to java.io.IOException: Read error
>> >>> 2010-08-13 14:05:09,763 INFO org.apache.zookeeper.server.NIOServerCnxn: closing session:0x12a6d2847e40001 NIOServerCnxn: java.nio.channels.SocketChannel[connected local=/128.115.210.149:2181 remote=/128.115.210.161:35883]
>> >>>
>> >>> HBase was running on this cluster a few months ago so I doubt it is a
>> >>> blatant misconfiguration at fault. I've tried restarting everything hbase or
>> >>> hadoop related as well as wiping out the hbase data directory on hdfs to
>> >>> start fresh with no result. Any hints or suggestions as to what the problem
>> >>> might be are greatly appreciated. Thanks!
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >
>> >
>>
>
>
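
The quoted report notes that the master log mentions only 7 region servers while the cluster has more than twice that many. A minimal sketch of how that mismatch could be checked automatically is below. It only parses the ServerManager summary line as it appears in the quoted log; it does not use any HBase API. The class name, the method name, and the expected count of 23 (taken from the environment listing) are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: extracts the live region server
// count from a master log line of the form quoted in the thread.
class RegionServerCountCheck {
    // Matches e.g. "7 region servers, 0 dead, average load 3.142857142857143".
    private static final Pattern SUMMARY =
        Pattern.compile("(\\d+) region servers, (\\d+) dead, average load ([0-9.]+)");

    /** Returns the reported region server count, or -1 if the line has no summary. */
    static int reportedServers(String logLine) {
        Matcher m = SUMMARY.matcher(logLine);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        String line = "2010-08-13 14:04:32,018 INFO "
            + "org.apache.hadoop.hbase.master.ServerManager: "
            + "7 region servers, 0 dead, average load 3.142857142857143";
        int expected = 23; // assumption: taken from the environment listing in the thread
        int reported = reportedServers(line);
        if (reported >= 0 && reported < expected) {
            System.out.println("Master reports " + reported + " of " + expected
                + " expected region servers");
        }
    }
}
```

Running a check like this against the master log after startup would flag the case described in the thread, where most region servers never check in with the master.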
