hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: AssignmentManager looping?
Date Thu, 01 Aug 2013 18:07:42 GMT
No, it's an HBase 0.94.10 cluster with Hadoop 1.0.3, everything installed
manually from JARs ;) It's a mess to monitor and I would have loved to have
it under CM by now, but I have to deal with that ;)

I'm building a 2nd cluster at home so I will be able to replicate this one
to the other one, which might allow me to play even further with it...

I will try to reproduce the issue, just give me a couple of hours...

JM

2013/8/1 Kevin O'dell <kevin.odell@cloudera.com>

> Jimmy,
>
>   Sounds like our dreaded reference file issue again. I spoke with JM and
> he is going to try to reproduce this. My gut tells me our point of no
> return may be in the wrong place due to some code change along the way, but
> hbck could also just be doing something wonky.
>
> JM,
>
>   This cluster is not CM managed, correct?
> On Aug 1, 2013 1:49 PM, "Jean-Marc Spaggiari" <jean-marc@spaggiari.org>
> wrote:
>
> > So I had to remove a few reference files and run hbck a few times to get
> > everything back online.
> >
> > Summary: don't stop your cluster while it's major compacting huge tables ;)
> >
> > Thanks all!
> >
> > JM
> >
> > 2013/8/1 Kevin O'dell <kevin.odell@cloudera.com>
> >
> > > If that doesn't work, you probably have an invalid reference file, and you
> > > will find that in the RS logs for the HLog split that never finishes.
> > > On Aug 1, 2013 1:38 PM, "Kevin O'dell" <kevin.odell@cloudera.com> wrote:
> > >
> > > > JM,
> > > >
> > > > Stop HBase
> > > > rmr /hbase from zkcli
> > > > Sideline META
> > > > Run offline meta repair
> > > > Start HBase
> > > > On Aug 1, 2013 1:01 PM, "Jean-Marc Spaggiari" <jean-marc@spaggiari.org> wrote:
> > > >
> > > >> Hi Jimmy,
> > > >>
> > > >> I should still have all the logs.
> > > >>
> > > >> What I did is pretty simple.
> > > >>
> > > > >> I tried to turn the cluster off while a single-region 250GB table was
> > > > >> under major_compaction to get it split.
> > > > >>
> > > > >> I will targz all the logs for the last few days and make that available.
> > > >>
> > > >> On the other side, I'm still not able to bring it back up...
> > > >>
> > > >> JM
> > > >>
> > > >> 2013/8/1 Jimmy Xiang <jxiang@cloudera.com>
> > > >>
> > > > >> > Something went wrong with split.  It should be easy to fix your cluster.
> > > > >> > However, it will be more interesting to find out how it happened. Do you
> > > > >> > remember what has happened since it was good previously? Do you have all
> > > > >> > the logs?
> > > >> >
> > > >> >
> > > > >> > On Thu, Aug 1, 2013 at 7:08 AM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
> > > >> >
> > > > >> > > I tried to remove the znodes but got the same result. So I shut down all
> > > > >> > > the RS and restarted HBase, and now I have 0 regions for this table.
> > > > >> > > Running HBCK. Seems that it has a lot to do...
> > > >> > >
> > > >> > > 2013/8/1 Kevin O'dell <kevin.odell@cloudera.com>
> > > >> > >
> > > > >> > > > Yes, you can if HBase is down. First I would copy .META out of HDFS to
> > > > >> > > > local, and then you can search it for split issues. Deleting those znodes
> > > > >> > > > should clear this up though.
> > > > >> > > > On Aug 1, 2013 8:52 AM, "Jean-Marc Spaggiari" <jean-marc@spaggiari.org> wrote:
> > > >> > > >
> > > >> > > > > I can't check the meta since HBase is down.
> > > >> > > > >
> > > > >> > > > > Regarding HDFS, I took a few random lines like:
> > > > >> > > > > 2013-08-01 08:45:57,260 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 28328fdb7181cbd9cc4d6814775e8895 not found on server node4,60020,1375319042033; failed processing
> > > > >> > > > > 2013-08-01 08:45:57,260 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 28328fdb7181cbd9cc4d6814775e8895 from server node4,60020,1375319042033 but it doesn't exist anymore, probably already processed its split
> > > > >> > > > >
> > > > >> > > > > And each time, there is nothing like that.
> > > > >> > > > > hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -lsr / | grep 28328fdb7181cbd9cc4d6814775e8895
> > > >> > > > >
> > > >> > > > > On ZK side:
> > > >> > > > > [zk: localhost:2181(CONNECTED) 3] ls /hbase/splitlog
> > > >> > > > >
> > > >> > > > > [zk: localhost:2181(CONNECTED) 10] ls /hbase/unassigned
> > > > >> > > > > [28328fdb7181cbd9cc4d6814775e8895, a8781a598c46f19723a2405345b58470,
> > > > >> > > > > b7ebfeb63b10997736fd12920fde2bb8, d95bb27cc026511c2a8c8ad155e79bf6,
> > > > >> > > > > 270a9c371fcbe9cd9a04986e0b77d16b, aff4d1d8bf470458bb19525e8aef0759]
> > > >> > > > >
> > > > >> > > > > Can I just delete those znodes? Worst case, hbck will recover them from
> > > > >> > > > > HDFS if required?
> > > >> > > > >
> > > >> > > > > JM
> > > >> > > > >
> > > >> > > > > 2013/8/1 Kevin O'dell <kevin.odell@cloudera.com>
> > > >> > > > >
> > > >> > > > > > Does it exist in meta or hdfs?
> > > > >> > > > > > On Aug 1, 2013 8:24 AM, "Jean-Marc Spaggiari" <jean-marc@spaggiari.org> wrote:
> > > >> > > > > >
> > > > >> > > > > > > My master keeps logging this:
> > > >> > > > > > >
> > > > >> > > > > > > 2013-07-31 21:52:59,201 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing
> > > > >> > > > > > > 2013-07-31 21:52:59,201 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but it doesn't exist anymore, probably already processed its split
> > > > >> > > > > > > 2013-07-31 21:52:59,339 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing
> > > > >> > > > > > > 2013-07-31 21:52:59,339 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but it doesn't exist anymore, probably already processed its split
> > > > >> > > > > > > 2013-07-31 21:52:59,461 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing
> > > > >> > > > > > > 2013-07-31 21:52:59,461 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but it doesn't exist anymore, probably already processed its split
> > > > >> > > > > > > 2013-07-31 21:52:59,636 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing
> > > > >> > > > > > > 2013-07-31 21:52:59,636 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but it doesn't exist anymore, probably already processed its split
> > > > >> > > > > > > 2013-07-31 21:53:00,074 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing
> > > > >> > > > > > > 2013-07-31 21:53:00,074 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but it doesn't exist anymore, probably already processed its split
> > > > >> > > > > > > 2013-07-31 21:53:00,261 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing
> > > > >> > > > > > > 2013-07-31 21:53:00,261 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but it doesn't exist anymore, probably already processed its split
> > > > >> > > > > > > 2013-07-31 21:53:00,417 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing
> > > > >> > > > > > > 2013-07-31 21:53:00,417 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but it doesn't exist anymore, probably already processed its split
> > > >> > > > > > >
> > > > >> > > > > > > hbase@node3:~/hbase-0.94.3$ cat logs/hbase-hbase-master-node3.log* | grep "Region 270a9c371fcbe9cd9a04986e0b77d16b not found " | wc
> > > > >> > > > > > >    5042   65546  927728
> > > >> > > > > > >
> > > >> > > > > > >
> > > >> > > > > > > Then crashed.
> > > > >> > > > > > > 2013-07-31 22:22:46,072 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: []
> > > > >> > > > > > > 2013-07-31 22:22:46,073 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected state :
> > > > >> > > > > > > work_proposed,\x02\xE8\x92'\x00\x00\x00\x00
> > > > >> > > > > > > http://video.inportnews.ca/search/all/source/sun-news-network/harry-potter-in-translation/68463493001/page/1526,1375307272709.d95bb27cc026511c2a8c8ad155e79bf6.
> > > > >> > > > > > > state=OPENING, ts=1375323766008, server=node7,60020,1375319044055 .. Cannot transit it to OFFLINE.
> > > > >> > > > > > > java.lang.IllegalStateException: Unexpected state :
> > > > >> > > > > > > work_proposed,\x02\xE8\x92'\x00\x00\x00\x00
> > > > >> > > > > > > http://video.inportnews.ca/search/all/source/sun-news-network/harry-potter-in-translation/68463493001/page/1526,1375307272709.d95bb27cc026511c2a8c8ad155e79bf6.
> > > > >> > > > > > > state=OPENING, ts=1375323766008, server=node7,60020,1375319044055 .. Cannot transit it to OFFLINE.
> > > > >> > > > > > >     at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
> > > > >> > > > > > >     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
> > > > >> > > > > > >     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
> > > > >> > > > > > >     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
> > > > >> > > > > > >     at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
> > > > >> > > > > > >     at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
> > > > >> > > > > > >     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
> > > > >> > > > > > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> > > > >> > > > > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> > > > >> > > > > > >     at java.lang.Thread.run(Thread.java:722)
> > > > >> > > > > > > 2013-07-31 22:22:46,075 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> > > > >> > > > > > > 2013-07-31 22:22:46,075 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60000
> > > > >> > > > > > > 2013-07-31 22:22:46,075 INFO org.apache.hadoop.hbase.master.HMaster$2: node3,60000,1375322220614-BalancerChore exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,075 INFO org.apache.hadoop.hbase.master.CatalogJanitor: node3,60000,1375322220614-CatalogJanitor exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,076 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server listener on 60000
> > > > >> > > > > > > 2013-07-31 22:22:46,077 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60000: exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,077 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60000: exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,077 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 60000: exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,077 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60000: exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,076 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60000: exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,076 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC Server handler 2 on 60000: exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,076 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC Server handler 1 on 60000: exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,076 INFO org.apache.hadoop.ipc.HBaseServer: REPL IPC Server handler 0 on 60000: exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,077 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60000: exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,076 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60000: exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,077 INFO org.apache.hadoop.hbase.master.cleaner.HFileCleaner: master-node3,60000,1375322220614.archivedHFileCleaner exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,077 INFO org.apache.hadoop.hbase.master.cleaner.LogCleaner: master-node3,60000,1375322220614.oldLogCleaner exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,077 INFO org.apache.hadoop.hbase.master.HMaster: Stopping infoServer
> > > > >> > > > > > > 2013-07-31 22:22:46,077 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
> > > > >> > > > > > > 2013-07-31 22:22:46,077 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60000: exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,077 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60000: exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,077 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60000: exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,077 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
> > > > >> > > > > > > 2013-07-31 22:22:46,078 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:60010
> > > > >> > > > > > > 2013-07-31 22:22:46,127 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing
> > > > >> > > > > > > 2013-07-31 22:22:46,127 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but it doesn't exist anymore, probably already processed its split
> > > > >> > > > > > > 2013-07-31 22:22:46,181 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region aff4d1d8bf470458bb19525e8aef0759 not found on server node2,60020,1375319046072; failed processing
> > > > >> > > > > > > 2013-07-31 22:22:46,181 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region aff4d1d8bf470458bb19525e8aef0759 from server node2,60020,1375319046072 but it doesn't exist anymore, probably already processed its split
> > > > >> > > > > > > 2013-07-31 22:22:46,193 ERROR org.apache.hadoop.hbase.executor.ExecutorService: Cannot submit [ClosedRegionHandler-node3,60000,1375322220614-179] because the executor is missing. Is this process shutting down?
> > > > >> > > > > > > 2013-07-31 22:22:46,250 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 28328fdb7181cbd9cc4d6814775e8895 not found on server node4,60020,1375319042033; failed processing
> > > > >> > > > > > > 2013-07-31 22:22:46,250 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 28328fdb7181cbd9cc4d6814775e8895 from server node4,60020,1375319042033 but it doesn't exist anymore, probably already processed its split
> > > > >> > > > > > > 2013-07-31 22:22:46,262 INFO org.apache.hadoop.hbase.master.SplitLogManager$TimeoutMonitor: node3,60000,1375322220614.splitLogManagerTimeoutMonitor exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,293 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server node7,60020,1375319044055; failed processing
> > > > >> > > > > > > 2013-07-31 22:22:46,293 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but it doesn't exist anymore, probably already processed its split
> > > > >> > > > > > > 2013-07-31 22:22:46,294 INFO org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x240024f5666144b
> > > > >> > > > > > > 2013-07-31 22:22:46,361 WARN org.apache.hadoop.hbase.master.AssignmentManager: Region aff4d1d8bf470458bb19525e8aef0759 not found on server node2,60020,1375319046072; failed processing
> > > > >> > > > > > > 2013-07-31 22:22:46,362 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region aff4d1d8bf470458bb19525e8aef0759 from server node2,60020,1375319046072 but it doesn't exist anymore, probably already processed its split
> > > > >> > > > > > > 2013-07-31 22:22:46,388 INFO org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor: node3,60000,1375322220614.timeoutMonitor exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,388 INFO org.apache.hadoop.hbase.master.AssignmentManager$TimerUpdater: node3,60000,1375322220614.timerUpdater exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,402 INFO org.apache.hadoop.hbase.master.HMaster: HMaster main thread exiting
> > > > >> > > > > > > 2013-07-31 22:22:46,402 ERROR org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
> > > > >> > > > > > > java.lang.RuntimeException: HMaster Aborted
> > > > >> > > > > > >     at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:160)
> > > > >> > > > > > >     at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:104)
> > > > >> > > > > > >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > > > >> > > > > > >     at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
> > > > >> > > > > > >     at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2100)
> > > >> > > > > > >
> > > > >> > > > > > > It seems that HBCK can't do anything. I will start looking at the
> > > > >> > > > > > > files in HDFS, but suggestions are welcome.
> > > >> > > > > > >
> > > >> > > > > > > JM
> > > >> > > > > > >
> > > >> > > > > >
> > > >> > > > >
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > >
> >
>
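
For anyone hitting the same loop: the `grep ... | wc` check quoted above counts one region at a time. The sketch below does the same count for every region and server in one pass. It assumes each WARN entry sits on a single line in the real master log (the wrapping in this thread comes from mail quoting), and the function name and sample lines are only illustrative, not part of HBase.

```python
import re
from collections import Counter

# Matches the AssignmentManager warning quoted in this thread:
# "... AssignmentManager: Region <32-hex encoded name> not found on server <host,port,startcode>; ..."
NOT_FOUND = re.compile(
    r"AssignmentManager: Region ([0-9a-f]{32}) not found on server ([^;\s]+);"
)


def count_not_found(lines):
    """Count 'Region ... not found on server ...' warnings per (region, server)."""
    counts = Counter()
    for line in lines:
        match = NOT_FOUND.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Sample lines copied from the log excerpts in this thread (unwrapped).
SAMPLE = [
    "2013-07-31 22:22:46,127 WARN org.apache.hadoop.hbase.master.AssignmentManager: "
    "Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server "
    "node7,60020,1375319044055; failed processing",
    "2013-07-31 22:22:46,127 WARN org.apache.hadoop.hbase.master.AssignmentManager: "
    "Received SPLIT for region 270a9c371fcbe9cd9a04986e0b77d16b from server "
    "node7,60020,1375319044055 but it doesn't exist anymore, probably already "
    "processed its split",
    "2013-07-31 22:22:46,181 WARN org.apache.hadoop.hbase.master.AssignmentManager: "
    "Region aff4d1d8bf470458bb19525e8aef0759 not found on server "
    "node2,60020,1375319046072; failed processing",
    "2013-07-31 22:22:46,293 WARN org.apache.hadoop.hbase.master.AssignmentManager: "
    "Region 270a9c371fcbe9cd9a04986e0b77d16b not found on server "
    "node7,60020,1375319044055; failed processing",
    "2013-07-31 22:22:46,402 INFO org.apache.hadoop.hbase.master.HMaster: "
    "HMaster main thread exiting",
]

# Print the most frequently reported (region, server) pairs first.
for (region, server), n in count_not_found(SAMPLE).most_common():
    print(region, server, n)
```

To run it against real logs, pass an open file (or a chain of several) instead of SAMPLE, since the function only needs an iterable of lines.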
