hbase-dev mailing list archives

From Lars George <lars.geo...@gmail.com>
Subject Re: Borked Splitlog
Date Mon, 16 May 2011 09:07:16 GMT
I am still stuck with this cluster not starting again. I know it is
all local and therefore not really representative, but this ought to
work, no? Here is the log I get at startup:

2011-05-16 11:00:36,834 INFO
org.apache.hadoop.hbase.regionserver.SplitLogWorker: SplitLogWorker
10.0.0.64,60020,1305536432387 starting
2011-05-16 11:00:36,838 INFO
org.apache.hadoop.hbase.regionserver.StoreFile: Allocating
LruBlockCache with maximum size 197.5m
2011-05-16 11:00:36,850 INFO
org.apache.hadoop.hbase.regionserver.SplitLogWorker: successfully
transitioned task /hbase/splitlog/RESCAN0000234067 to final state done
2011-05-16 11:00:36,852 DEBUG
org.apache.hadoop.hbase.regionserver.SplitLogWorker: tasks arrived or
departed
2011-05-16 11:00:36,854 INFO
org.apache.hadoop.hbase.regionserver.SplitLogWorker: worker
10.0.0.64,60020,1305536432387 acquired task
/hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:00:36,871 DEBUG
org.apache.hadoop.hbase.monitoring.MonitoredTask: setDescritption:
Splitting log file
hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389into
a temporary staging area.
2011-05-16 11:00:36,874 INFO
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting hlog:
hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389,
length=16173236224
2011-05-16 11:00:36,874 DEBUG
org.apache.hadoop.hbase.monitoring.MonitoredTask: setStatus: Opening
log file
2011-05-16 11:00:36,875 INFO org.apache.hadoop.hbase.util.FSUtils:
Recovering file
hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389
2011-05-16 11:00:37,415 WARN
org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error
detected. Found 1 replicas but expecting 3 replicas.  Requesting close
of hlog.
2011-05-16 11:00:37,876 INFO org.apache.hadoop.hbase.util.FSUtils:
Finished lease recover attempt for
hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389
2011-05-16 11:00:38,073 INFO
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: This region's
directory doesn't exist:
hdfs://localhost:8020/hbase/usertable/30c4d0a47703214845d0676d0c7b36f0.
It is very likely that it was already split so it's safe to discard
those edits.
2011-05-16 11:00:38,074 INFO
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: processed 0
edits across 0 regions threw away edits for 1 regions log file =
hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389
is corrupted = false
2011-05-16 11:00:38,074 DEBUG
org.apache.hadoop.hbase.monitoring.MonitoredTask: setStatus: processed
0 edits across 0 regions threw away edits for 1 regions log file =
hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389
is corrupted = false
2011-05-16 11:00:38,074 DEBUG
org.apache.hadoop.hbase.monitoring.MonitoredTask: markComplete:
processed 0 edits across 0 regions threw away edits for 1 regions log
file = hdfs://localhost/hbase/.logs/10.0.0.65,60020,1305406356765/10.0.0.65%2C60020%2C1305406356765.1305409968389
is corrupted = false
2011-05-16 11:00:38,074 INFO
org.apache.hadoop.hbase.regionserver.SplitLogWorker: worker
10.0.0.64,60020,1305536432387 done with task
/hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
in 1217ms
2011-05-16 11:00:38,825 INFO
org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager:
Moving 10.0.0.64,60020,1305535848569's hlogs to my queue

==> /var/lib/hbase/logs/hbase-larsgeorge-5-master-de1-app-mbp-2.log <==
2011-05-16 11:00:41,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:42,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:43,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:44,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:45,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:46,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:47,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:48,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:49,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:50,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:51,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:52,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:53,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:54,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:55,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:56,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:57,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:58,691 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:00:59,692 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:01:00,692 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:01:01,692 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:01:02,692 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:01:03,692 INFO
org.apache.hadoop.hbase.master.SplitLogManager: resubmitting task
/hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:03,693 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 0
2011-05-16 11:01:03,693 INFO
org.apache.hadoop.hbase.master.SplitLogManager: resubmitted 1 out of 1
tasks
2011-05-16 11:01:03,694 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
/hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
ver = 28
2011-05-16 11:01:03,694 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
/hbase/splitlog/RESCAN0000234069 ver = 0
2011-05-16 11:01:04,693 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:04,693 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:05,693 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:05,693 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:06,693 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:06,693 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:07,693 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:07,693 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:08,693 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:08,693 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:09,694 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:09,694 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:10,694 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:10,694 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:11,694 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:11,694 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:12,694 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:12,694 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:13,694 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:13,694 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:14,695 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:14,695 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:15,695 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:15,695 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:16,695 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:16,695 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:17,695 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:17,695 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:18,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:18,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:19,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:19,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:20,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:20,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:21,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:21,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:22,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:22,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:23,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:23,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:24,697 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:24,697 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:25,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:25,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:26,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:26,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:27,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:27,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:28,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:28,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:29,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389
2011-05-16 11:01:29,697 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
unassigned = 1
2011-05-16 11:01:30,696 DEBUG
org.apache.hadoop.hbase.master.SplitLogManager: chore: unassigned task
path -> /hbase/splitlog/hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389

I hacked the code to have the SplitLogManager delete all orphaned
RESCAN znodes, as I ended up with hundreds of them and there seems to
be no way to "delete *" them, right? Is there a trick for deleting a
non-empty node in zkCli?
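
For illustration, here is a minimal sketch of the children-first delete that a non-empty znode requires, since ZooKeeper will not delete a node that still has children. It is not the hack mentioned above and not a zkCli feature. The `get_children` and `delete` callables and the `FakeZk` class are hypothetical stand-ins for a real client, there only so the sketch runs on its own.

```python
from typing import Callable, Iterable


def delete_recursive(path: str,
                     get_children: Callable[[str], Iterable[str]],
                     delete: Callable[[str], None]) -> int:
    """Delete `path` and everything below it, children first.

    The walk is post-order: descend, delete the leaves, then the parent.
    Returns the number of znodes removed.
    """
    removed = 0
    # Copy the child list first, because deleting changes what the parent holds.
    for child in list(get_children(path)):
        removed += delete_recursive(path.rstrip("/") + "/" + child,
                                    get_children, delete)
    delete(path)
    return removed + 1


class FakeZk:
    """Hypothetical in-memory stand-in for a ZooKeeper client (illustration only)."""

    def __init__(self, paths):
        self.paths = set(paths)

    def get_children(self, path):
        prefix = path.rstrip("/") + "/"
        return sorted(p[len(prefix):] for p in self.paths
                      if p.startswith(prefix) and "/" not in p[len(prefix):])

    def delete(self, path):
        # Mimic the rule that a node with children cannot be deleted.
        if self.get_children(path):
            raise RuntimeError("node not empty: " + path)
        self.paths.remove(path)


zk = FakeZk(["/hbase", "/hbase/splitlog",
             "/hbase/splitlog/RESCAN0000234069",
             "/hbase/splitlog/RESCAN0000234070"])
removed = delete_recursive("/hbase/splitlog", zk.get_children, zk.delete)
```

With a real client, the two callables would be whatever list-children and delete calls that client provides. The walk itself stays the same.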

Anyhow, the split is supposedly done, or at least the task reports as
complete. Then the replication ReplicationSourceManager kicks in, and
after that the task gets relisted over and over again. After just a
few minutes you see this in ZK's /hbase/splitlog:

[RESCAN0000234200, RESCAN0000234209, RESCAN0000234207,
RESCAN0000234208, RESCAN0000234205, RESCAN0000234206,
RESCAN0000234203, RESCAN0000234204, RESCAN0000234201,
RESCAN0000234202, RESCAN0000234237, RESCAN0000234236,
RESCAN0000234235, RESCAN0000234234, RESCAN0000234239,
RESCAN0000234238, RESCAN0000234232, RESCAN0000234233,
RESCAN0000234230, RESCAN0000234231, RESCAN0000234219,
RESCAN0000234218, RESCAN0000234217, RESCAN0000234216,
RESCAN0000234215, RESCAN0000234214, RESCAN0000234213,
RESCAN0000234212, RESCAN0000234210, RESCAN0000234211,
RESCAN0000234228, RESCAN0000234227, RESCAN0000234229,
RESCAN0000234224, RESCAN0000234223, RESCAN0000234226,
RESCAN0000234225, RESCAN0000234220, RESCAN0000234221,
RESCAN0000234222, RESCAN0000234100, RESCAN0000234101,
RESCAN0000234107, RESCAN0000234106, RESCAN0000234109,
RESCAN0000234108, RESCAN0000234103, RESCAN0000234102,
RESCAN0000234105, RESCAN0000234104, RESCAN0000234111,
RESCAN0000234112, RESCAN0000234110, RESCAN0000234116,
RESCAN0000234115, RESCAN0000234114, RESCAN0000234113,
RESCAN0000234119, RESCAN0000234118, RESCAN0000234117,
RESCAN0000234120, RESCAN0000234121, RESCAN0000234122,
RESCAN0000234123, RESCAN0000234125, RESCAN0000234124,
RESCAN0000234127, RESCAN0000234126, RESCAN0000234129,
RESCAN0000234128, RESCAN0000234134, RESCAN0000234133,
RESCAN0000234132, RESCAN0000234131, RESCAN0000234130,
RESCAN0000234139, RESCAN0000234137, RESCAN0000234138,
RESCAN0000234135, RESCAN0000234136, RESCAN0000234143,
RESCAN0000234142, RESCAN0000234145, RESCAN0000234144,
RESCAN0000234141, RESCAN0000234140, RESCAN0000234146,
RESCAN0000234147, RESCAN0000234148, RESCAN0000234149,
RESCAN0000234152, RESCAN0000234151, RESCAN0000234150,
RESCAN0000234156, RESCAN0000234155, RESCAN0000234154,
RESCAN0000234153, RESCAN0000234159, RESCAN0000234157,
RESCAN0000234158, RESCAN0000234161, RESCAN0000234160,
RESCAN0000234163, RESCAN0000234162, RESCAN0000234165,
RESCAN0000234164, RESCAN0000234167, RESCAN0000234166,
RESCAN0000234168, RESCAN0000234169, RESCAN0000234179,
RESCAN0000234175, RESCAN0000234176, RESCAN0000234177,
RESCAN0000234178, RESCAN0000234171, RESCAN0000234172,
RESCAN0000234173, RESCAN0000234174, RESCAN0000234170,
RESCAN0000234188, RESCAN0000234189, RESCAN0000234186,
RESCAN0000234187, RESCAN0000234184, RESCAN0000234185,
RESCAN0000234182, RESCAN0000234183, RESCAN0000234180,
RESCAN0000234181, RESCAN0000234193, RESCAN0000234194,
RESCAN0000234195, RESCAN0000234196, RESCAN0000234197,
RESCAN0000234198, RESCAN0000234199, RESCAN0000234190,
RESCAN0000234191, RESCAN0000234192, RESCAN0000234070,
RESCAN0000234071, RESCAN0000234072, RESCAN0000234073,
RESCAN0000234074, RESCAN0000234075, RESCAN0000234076,
RESCAN0000234077, RESCAN0000234078, RESCAN0000234079,
RESCAN0000234081, RESCAN0000234082, RESCAN0000234080,
RESCAN0000234085, RESCAN0000234086, RESCAN0000234083,
RESCAN0000234084, RESCAN0000234089, RESCAN0000234087,
RESCAN0000234088, RESCAN0000234069, RESCAN0000234099,
RESCAN0000234098, RESCAN0000234095, RESCAN0000234094,
RESCAN0000234097, RESCAN0000234096, RESCAN0000234091,
RESCAN0000234090, RESCAN0000234093, RESCAN0000234092,
hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765%2F10.0.0.65%252C60020%252C1305406356765.1305409968389]
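
The listing above mixes the RESCAN nodes with one real task. Judging from the log lines earlier, that task's name appears to be the URL-encoded HDFS path of the log being split. A small sketch for separating the two kinds of children and decoding the task name follows. The function name `split_children` is made up for this example, and the sample list is a shortened copy of the listing.

```python
from urllib.parse import unquote


def split_children(children):
    """Separate RESCAN markers from real split tasks, decoding task names."""
    rescans = sorted(c for c in children if c.startswith("RESCAN"))
    # One round of URL-decoding turns the znode name back into the log path;
    # the doubly encoded "%252C" in the file name becomes "%2C", as in the logs.
    tasks = [unquote(c) for c in children if not c.startswith("RESCAN")]
    return rescans, tasks


children = [
    "RESCAN0000234200",
    "RESCAN0000234069",
    "hdfs%3A%2F%2Flocalhost%2Fhbase%2F.logs%2F10.0.0.65%2C60020%2C1305406356765"
    "%2F10.0.0.65%252C60020%252C1305406356765.1305409968389",
]
rescans, tasks = split_children(children)
```

Sorting the RESCAN names makes the sequence numbers easier to scan than the unordered listing ZK returns.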

After that everything is stuck. Any ideas?

On Mon, May 16, 2011 at 7:03 AM, Lars George <lars.george@gmail.com> wrote:
> Hi,
>
> I am on trunk and testing in pseudo distributed setup. I loaded the
> machine with YCSB and got it to break at a few million inserts during
> the load phase with the GC taking too long and the compaction queue
> going through the roof subsequently. Since then I cannot recover the
> local "cluster". It is stuck printing this:
>
> ...
> 2011-05-16 06:59:05,389 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/RESCAN0000148501 ver = 0
> 2011-05-16 06:59:06,388 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
> unassigned = 1
> 2011-05-16 06:59:06,389 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: resubmitting
> unassigned task(s) after timeout
> 2011-05-16 06:59:06,390 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/RESCAN0000148502 ver = 0
> 2011-05-16 06:59:07,388 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
> unassigned = 1
> 2011-05-16 06:59:07,388 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: resubmitting
> unassigned task(s) after timeout
> 2011-05-16 06:59:07,389 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/RESCAN0000148503 ver = 0
> 2011-05-16 06:59:08,388 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
> unassigned = 1
> 2011-05-16 06:59:08,388 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: resubmitting
> unassigned task(s) after timeout
> 2011-05-16 06:59:08,389 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/RESCAN0000148504 ver = 0
> 2011-05-16 06:59:09,388 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 1
> unassigned = 1
> 2011-05-16 06:59:09,389 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: resubmitting
> unassigned task(s) after timeout
> 2011-05-16 06:59:09,390 DEBUG
> org.apache.hadoop.hbase.master.SplitLogManager: task not yet acquired
> /hbase/splitlog/RESCAN0000148505 ver = 0
> ...
>
> This keeps on going up and up. What is the right way to recover from
> this? Delete something from ZK? Delete something from HDFS? What shell
> commands would help?
>
> Thanks,
> Lars
>
