hbase-user mailing list archives

From oc tsdb <oc.t...@gmail.com>
Subject Re: getting splitmanager debug logs continuously
Date Thu, 08 Aug 2013 21:41:34 GMT
I am using HBase 0.92.

The region server was not running on any of the nodes.

I restarted the cluster. That started a region server on every node except
the HMaster node, but the cluster is still unresponsive.

Processes running on the master:
TSDMain
HMaster
SecondaryNameNode
NameNode
JobTracker
HQuorumPeer

Processes running on all other nodes:
DataNode
TaskTracker
RegionServer
TSDMain

This time, I see the error messages in the attached log.

Could you please suggest how I can recover/restore the data and bring the
cluster back up?

Thanks & Regards,
VSR



On Thu, Aug 8, 2013 at 1:40 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Can you tell us the version of HBase you're using ?
>
> Do you find something in region server logs on the 4 remaining nodes ?
>
> Cheers
>
> On Thu, Aug 8, 2013 at 1:36 PM, oc tsdb <oc.tsdb@gmail.com> wrote:
>
> > Hi,
> >
> > I am running a cluster with 6 nodes.
> > Two of the 6 nodes in my cluster went down (due to another application's
> > failure) and came back after some time (I had to do a power reboot).
> > When these nodes came back, I used to get "WARN org.apache.hadoop.DFSClient:
> > Failed to connect to , add to deadnodes and continue".
> > Now those messages have stopped, and I am getting continuous debug messages
> > as follows.
> >
> > 2013-08-08 12:57:36,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:37,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-3.corp.oc.com%2C60020%2C1375466447768-splitting%2Fmb-3.corp.oc.com%252C60020%252C1375466447768.1375631802971 ver = 0
> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com%252C60020%252C1375466460755.1375623787557 ver = 0
> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com%252C60020%252C1375466460755.1375619231059 ver = 3
> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-2.corp.oc.com%2C60020%2C1375466479427-splitting%2Fmb-2.corp.oc.com%252C60020%252C1375466479427.1375639017535 ver = 0
> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com%252C60020%252C1375466460755.1375623021175 ver = 0
> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-3.corp.oc.com%2C60020%2C1375466447768-splitting%2Fmb-3.corp.oc.com%252C60020%252C1375466447768.1375630425141 ver = 0
> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: resubmitting unassigned task(s) after timeout
> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com%252C60020%252C1375466460755.1375620714514 ver = 3
> > 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-6.corp.oc.com%2C60020%2C1375924525310-splitting%2Fmb-6.corp.oc.com%252C60020%252C1375924525310.1375924529658 ver = 0
> > 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-4.corp.oc.com%2C60020%2C1375466551673-splitting%2Fmb-4.corp.oc.com%252C60020%252C1375466551673.1375641592581 ver = 0
> > 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-5.corp.oc.com%2C60020%2C1375924528073-splitting%2Fmb-5.corp.oc.com%252C60020%252C1375924528073.1375924532442 ver = 0
> > 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com%252C60020%252C1375466460755.1375622290167 ver = 3
> > 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-5.corp.oc.com%2C60020%2C1375466463385-splitting%2Fmb-5.corp.oc.com%252C60020%252C1375466463385.1375638183425 ver = 0
> > 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-5.corp.oc.com%2C60020%2C1375466463385-splitting%2Fmb-5.corp.oc.com%252C60020%252C1375466463385.1375639599559 ver = 0
> > 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-5.corp.oc.com%2C60020%2C1375466463385-splitting%2Fmb-5.corp.oc.com%252C60020%252C1375466463385.1375641710787 ver = 3
> > 2013-08-08 12:57:37,633 INFO org.apache.hadoop.hbase.master.SplitLogManager: task /hbase/splitlog/RESCAN0000006975 entered state done mb-1.corp.oc.com,60000,1375924508669
> > 2013-08-08 12:57:37,633 DEBUG org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted /hbase/splitlog/RESCAN0000006975
> > 2013-08-08 12:57:37,633 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: deleted task without in memory state /hbase/splitlog/RESCAN0000006975
> > 2013-08-08 12:57:38,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:39,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:40,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:41,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:42,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:43,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:44,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:45,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:46,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:47,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:48,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:49,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:50,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:51,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:52,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:53,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:54,487 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@24ddb5c9; serverName=
> > 2013-08-08 12:57:54,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:55,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:56,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:57,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:58,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:57:59,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> > 2013-08-08 12:58:00,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >
> >
> > The cluster is unresponsive. I cannot access port 4242 on any of the
> > cluster nodes.
> > When I try to run the tsdb command "tsdb uid grep metrics .", I get the
> > following error messages:
> >   ERROR [main-EventThread] HBaseClient: The znode for the -ROOT- region
> > doesn't exist!
> >   ERROR [main-EventThread] HBaseClient: The znode for the -ROOT- region
> > doesn't exist!
> >
> > Could you please suggest what I can do to stop it?
> >
> > Thanks in Advance.
> >
> > Regards,
> > OC.
> >
>
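
The "task not yet acquired" entries quoted above name each split task by the
URL-encoded HDFS path of the write-ahead log it covers. As a reading aid only,
here is a minimal Python sketch that decodes one such task name back into a
path. The helper name decode_split_task is hypothetical and not part of HBase.
The sketch assumes the name is decoded exactly once, on the assumption that
the %2C sequences left over belong to the log file's real name in HDFS.

```python
from urllib.parse import unquote

SPLITLOG_PREFIX = "/hbase/splitlog/"


def decode_split_task(znode: str) -> str:
    """Return the log path encoded in a split task znode name.

    Hypothetical helper for reading the master log; not an HBase API.
    """
    # The mail client wrapped the znode names, so drop any whitespace first.
    name = "".join(znode.split())
    if name.startswith(SPLITLOG_PREFIX):
        name = name[len(SPLITLOG_PREFIX):]
    # Decode one level only (assumption): %252C becomes %2C, which is
    # presumably part of the actual file name and should be left alone.
    return unquote(name)


# Task name copied from the first "task not yet acquired" entry in the log.
task = (
    "/hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%"
    "2Fmb-3.corp.oc.com%2C60020%2C1375466447768-splitting%2Fmb-3.corp.oc.com"
    "%252C60020%252C1375466447768.1375631802971"
)
print(decode_split_task(task))
```

For that entry the sketch prints
hdfs://mb-1.corp.oc.com:54310/hbase/.logs/mb-3.corp.oc.com,60020,1375466447768-splitting/mb-3.corp.oc.com%2C60020%2C1375466447768.1375631802971,
which shows that the pending task refers to a log under the -splitting
directory of the mb-3 region server.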
