hbase-user mailing list archives

From oc tsdb <oc.t...@gmail.com>
Subject Re: getting splitmanager debug logs continuously
Date Thu, 08 Aug 2013 22:50:39 GMT
All the nodes are running, but the master does not run a region server; the
master is limited to running the NameNode, quorum, and HMaster.
Do you mean I should run a region server on the master node as well?


On Thu, Aug 8, 2013 at 2:48 PM, Jimmy Xiang <jxiang@cloudera.com> wrote:

> Can you start the master as well (besides region servers)?
>
>
> On Thu, Aug 8, 2013 at 2:41 PM, oc tsdb <oc.tsdb@gmail.com> wrote:
>
> > I am using hbase-0.92
> >
> > Region server was not running on any of the nodes.
> >
> > I restarted the cluster. It started a region server on every node except
> > the HMaster node, but the cluster is still unresponsive.
> >
> > processes running on master are
> > TSDMain
> > HMaster
> > SecondaryNameNode
> > NameNode
> > JobTracker
> > HQuorumPeer
> >
> > processes running on all other nodes are
> > DataNode
> > TaskTracker
> > RegionServer
> > TSDMain
> >
> > This time, I see the error messages in the attached log.
> >
> > Could you please suggest whether I can recover/restore the data and get
> > the cluster up?
> >
> > Thanks & Regards,
> > VSR
> >
> >
> >
> > On Thu, Aug 8, 2013 at 1:40 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> >> Can you tell us the version of HBase you're using ?
> >>
> >> Do you find something in region server logs on the 4 remaining nodes ?
> >>
> >> Cheers
> >>
> >> On Thu, Aug 8, 2013 at 1:36 PM, oc tsdb <oc.tsdb@gmail.com> wrote:
> >>
> >> > Hi,
> >> >
> >> > I am running a cluster with 6 nodes.
> >> > Two of the 6 nodes went down (due to another application's failure)
> >> > and came back after some time (I had to do a power reboot).
> >> > When these nodes came back I used to get "WARN
> >> > org.apache.hadoop.DFSClient:
> >> > Failed to connect to , add to deadnodes and continue".
> >> > Now those messages have stopped and I am getting continuous debug
> >> > messages as follows.
> >> >
> >> > 2013-08-08 12:57:36,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:37,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-3.corp.oc.com%2C60020%2C1375466447768-splitting%2Fmb-3.corp.oc.com%252C60020%252C1375466447768.1375631802971 ver = 0
> >> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com%252C60020%252C1375466460755.1375623787557 ver = 0
> >> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com%252C60020%252C1375466460755.1375619231059 ver = 3
> >> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-2.corp.oc.com%2C60020%2C1375466479427-splitting%2Fmb-2.corp.oc.com%252C60020%252C1375466479427.1375639017535 ver = 0
> >> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com%252C60020%252C1375466460755.1375623021175 ver = 0
> >> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-3.corp.oc.com%2C60020%2C1375466447768-splitting%2Fmb-3.corp.oc.com%252C60020%252C1375466447768.1375630425141 ver = 0
> >> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: resubmitting unassigned task(s) after timeout
> >> > 2013-08-08 12:57:37,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com%252C60020%252C1375466460755.1375620714514 ver = 3
> >> > 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-6.corp.oc.com%2C60020%2C1375924525310-splitting%2Fmb-6.corp.oc.com%252C60020%252C1375924525310.1375924529658 ver = 0
> >> > 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-4.corp.oc.com%2C60020%2C1375466551673-splitting%2Fmb-4.corp.oc.com%252C60020%252C1375466551673.1375641592581 ver = 0
> >> > 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-5.corp.oc.com%2C60020%2C1375924528073-splitting%2Fmb-5.corp.oc.com%252C60020%252C1375924528073.1375924532442 ver = 0
> >> > 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-6.corp.oc.com%2C60020%2C1375466460755-splitting%2Fmb-6.corp.oc.com%252C60020%252C1375466460755.1375622290167 ver = 3
> >> > 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-5.corp.oc.com%2C60020%2C1375466463385-splitting%2Fmb-5.corp.oc.com%252C60020%252C1375466463385.1375638183425 ver = 0
> >> > 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-5.corp.oc.com%2C60020%2C1375466463385-splitting%2Fmb-5.corp.oc.com%252C60020%252C1375466463385.1375639599559 ver = 0
> >> > 2013-08-08 12:57:37,630 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired /hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2Fmb-5.corp.oc.com%2C60020%2C1375466463385-splitting%2Fmb-5.corp.oc.com%252C60020%252C1375466463385.1375641710787 ver = 3
> >> > 2013-08-08 12:57:37,633 INFO org.apache.hadoop.hbase.master.SplitLogManager: task /hbase/splitlog/RESCAN0000006975 entered state done mb-1.corp.oc.com,60000,1375924508669
> >> > 2013-08-08 12:57:37,633 DEBUG org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted /hbase/splitlog/RESCAN0000006975
> >> > 2013-08-08 12:57:37,633 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: deleted task without in memory state /hbase/splitlog/RESCAN0000006975
> >> > 2013-08-08 12:57:38,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:39,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:40,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:41,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:42,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:43,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:44,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:45,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:46,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:47,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:48,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:49,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:50,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:51,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:52,628 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:53,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:54,487 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Lookedup root region location, connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@24ddb5c9; serverName=
> >> > 2013-08-08 12:57:54,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:55,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:56,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:57,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:58,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:57:59,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> > 2013-08-08 12:58:00,629 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 14 unassigned = 14
> >> >
> >> >
> >> > The cluster is unresponsive. I cannot access port 4242 on any of the
> >> > cluster nodes.
> >> > When I try to run the tsdb command "tsdb uid grep metrics .", I get
> >> > the following error messages:
> >> >   ERROR [main-EventThread] HBaseClient: The znode for the -ROOT- region doesn't exist!
> >> >   ERROR [main-EventThread] HBaseClient: The znode for the -ROOT- region doesn't exist!
> >> >
> >> > Could you please suggest what I can do to stop it?
> >> >
> >> > Thanks in Advance.
> >> >
> >> > Regards,
> >> > OC.
> >> >
> >>
> >
> >
>
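
The task names in the SplitLogManager lines quoted above are URL-encoded
HDFS paths, which makes it hard to see which region server's logs are
waiting to be split. Below is a minimal Python sketch for decoding them. It
assumes the /hbase/splitlog/ prefix seen in this log; the function names
decode_splitlog_task and server_dir are hypothetical helpers for
illustration and are not part of HBase.

```python
from urllib.parse import unquote

# Prefix of the split-log task znodes as they appear in the quoted log.
SPLITLOG_PREFIX = "/hbase/splitlog/"


def decode_splitlog_task(znode: str) -> str:
    """Return the HDFS path encoded in a split-log task znode name.

    The znode name is decoded exactly once, so a file name that was itself
    percent-encoded (%252C in the znode) still contains %2C afterwards.
    """
    name = znode[len(SPLITLOG_PREFIX):] if znode.startswith(SPLITLOG_PREFIX) else znode
    return unquote(name)


def server_dir(wal_path: str) -> str:
    """Return the '<host>,<port>,<startcode>' directory that holds the log file."""
    parent = wal_path.rsplit("/", 2)[-2]
    suffix = "-splitting"
    return parent[:-len(suffix)] if parent.endswith(suffix) else parent


if __name__ == "__main__":
    task = (
        "/hbase/splitlog/hdfs%3A%2F%2Fmb-1.corp.oc.com%3A54310%2Fhbase%2F.logs%2F"
        "mb-3.corp.oc.com%2C60020%2C1375466447768-splitting%2F"
        "mb-3.corp.oc.com%252C60020%252C1375466447768.1375631802971"
    )
    path = decode_splitlog_task(task)
    print(path)
    print(server_dir(path))
```

Applied to the first task in the log, this shows a log file under
hdfs://mb-1.corp.oc.com:54310/hbase/.logs/ belonging to
mb-3.corp.oc.com,60020,1375466447768. Grouping the 14 tasks by that
directory name is one way to see which servers' logs are still unassigned.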
