hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gaojinchao <gaojinc...@huawei.com>
Subject Re: about TestRollingRestart
Date Wed, 25 May 2011 10:26:24 GMT
Root and Meta had assigned finally from logs.
I am not sure what 's up.
It seems that some exception for socket.
So ServerShutdownHandler couldn't finish and I try to reproduce it again(I think it is easy
)

2011-05-24 09:09:49,292 WARN  [RegionServer:3;C4C2.site,56262,1306199352333-EventThread] zookeeper.MetaNodeTracker(66):
Tried to reset META server location after seeing the completion of a new META assignment but
got an IOE
org.apache.hadoop.hbase.NotAllMetaRegionsOnlineException: Timed out (2147483647ms)
	at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:332)
	at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:364)
	at org.apache.hadoop.hbase.zookeeper.MetaNodeTracker.nodeDeleted(MetaNodeTracker.java:64)
	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:276)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
2011-05-24 09:09:49,314 INFO  [RS_OPEN_ROOT-C4C2.site,57417,1306199361353-0] regionserver.Store(494):
Renaming flushed file at hdfs://localhost:48674/user/root/-ROOT-/70236052/.tmp/4785152515400750744
to hdfs://localhost:48674/user/root/-ROOT-/70236052/info/8393650380237115731
2011-05-24 09:09:49,330 INFO  [RegionServer:3;C4C2.site,57417,1306199361353-EventThread] catalog.CatalogTracker(419):
Passed metaserver is null
2011-05-24 09:09:49,334 DEBUG [main] master.TestRollingRestart(322):

-----邮件原件-----
发件人: saint.ack@gmail.com [mailto:saint.ack@gmail.com] 代表 Stack
发送时间: 2011年5月24日 23:16
收件人: user@hbase.apache.org
主题: Re: about TestRollingRestart

Gao:

Check out our hudson build.  Look there for TestRollingRestart
failures.  Usually it passes.  See here
https://builds.apache.org/hudson/job/HBase-TRUNK/ for TRUNK.  And here
for branch: https://builds.apache.org/hudson/view/G-L/view/HBase/job/hbase-0.90/.

Can you figure what its waiting on in the below (TestRollingRestart is
a nice ugly test!)
St.Ack

On Tue, May 24, 2011 at 2:31 AM, Gaojinchao <gaojinchao@huawei.com> wrote:
> hbase.master.assignment.timeoutmonitor.timeout should be set higher in TestRollingRestart
case.
>
> It is killed sometimes when we run all case.
>
> This is my analysis,.
> Is there anyone who encounter?
>
> Logs:
> // reassigned root and meta when regionsever has hutdowed
> 2011-05-24 09:09:32,989 DEBUG [MASTER_META_SERVER_OPERATIONS-C4C2.site:57883-4] master.AssignmentManager(926):
Assigning region -ROOT-,,0.70236052 to C4C2.site,57417,1306199361353
> 2011-05-24 09:09:32,989 DEBUG [main-EventThread] zookeeper.ZooKeeperWatcher(252): master:57883-0x1301f8b88270014
Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/root-region-server
> 2011-05-24 09:09:32,989 DEBUG [MASTER_META_SERVER_OPERATIONS-C4C2.site:57883-4] master.ServerManager(617):
New connection to C4C2.site,57417,1306199361353
> 2011-05-24 09:09:32,992 INFO  [PRI IPC Server handler 1 on 57417] regionserver.HRegionServer(2073):
Received request to open region: -ROOT-,,0.7023
>
> // reassigned root because of timeout.
> 2011-05-24 09:09:39,628 INFO  [C4C2.site:57883.timeoutMonitor] master.AssignmentManager$TimeoutMonitor(1676):
Regions in transition timed out:  -ROOT-,,0.70236052 state=OPENING, ts=1306199374115
> 2011-05-24 09:09:43,389 INFO  [C4C2.site:57883.timeoutMonitor] master.AssignmentManager$TimeoutMonitor(1676):
Regions in transition timed out:  .META.,,1.1028785192 state=OPENING, ts=1306199375803
> 2011-05-24 09:09:47,632 INFO  [C4C2.site:57883.timeoutMonitor] master.AssignmentManager$TimeoutMonitor(1676):
Regions in transition timed out:  -ROOT-,,0.70236052 state=OPENING, ts=1306199374115
>
> // some exception happened
> 2011-05-24 09:09:49,292 WARN  [RegionServer:3;C4C2.site,56262,1306199352333-EventThread]
zookeeper.MetaNodeTracker(66): Tried to reset META server location after seeing the completion
of a new META assignment but got an IOE
> org.apache.hadoop.hbase.NotAllMetaRegionsOnlineException: Timed out (2147483647ms)
>         at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMeta(CatalogTracker.java:332)
>         at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForMetaServerConnectionDefault(CatalogTracker.java:364)
>         at org.apache.hadoop.hbase.zookeeper.MetaNodeTracker.nodeDeleted(MetaNodeTracker.java:64)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:276)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
> 2011-05-24 09:09:49,314 INFO  [RS_OPEN_ROOT-C4C2.site,57417,1306199361353-0] regionserver.Store(494):
Renaming flushed file at hdfs://localhost:48674/user/root/-ROOT-/70236052/.tmp/4785152515400750744
to hdfs://localhost:48674/user/root/-ROOT-/70236052/info/8393650380237115731
> 2011-05-24 09:09:49,330 INFO  [RegionServer:3;C4C2.site,57417,1306199361353-EventThread]
catalog.CatalogTracker(419): Passed metaserver is null
> 2011-05-24 09:09:49,334 DEBUG [main] master.TestRollingRestart(322):
> TRR: Server [C4C2.site,45066,1306199358380] still being processed, waiting
>
> 2011-05-24 09:09:49,349 INFO  [RS_OPEN_ROOT-C4C2.site,57417,1306199361353-0] regionserver.Store(504):
Added hdfs://localhost:48674/user/root/-ROOT-/70236052/info/8393650380237115731, entries=4,
sequenceid=86, memsize=928.0, filesize=612.0
> 2011-05-24 09:09:49,350 INFO  [RS_OPEN_ROOT-C4C2.site,57417,1306199361353-0] regionserver.HRegion(1022):
Finished memstore flush of ~928.0 for region -ROOT-,,0.70236052 in 128ms, sequenceid=86, compaction
requested=false
> 2011-05-24 09:09:49,351 DEBUG [RS_OPEN_ROOT-C4C2.site,57417,1306199361353-0] regionserver.Store(418):
closed info
> 2011-05-24 09:09:49,352 INFO  [RS_OPEN_ROOT-C4C2.site,57417,1306199361353-0] regionserver.HRegion(550):
Closed -ROOT-,,0.70236052
> 2011-05-24 09:09:49,363 INFO  [RegionServer:3;C4C2.site,56262,1306199352333.compactor]
regionserver.Store(737): Completed major compaction of 4 file(s), new file=hdfs://localhost:48674/user/root/-ROOT-/70236052/info/3975431265129056699,
size=1.0k; total size for store is 1.0k
> 2011-05-24 09:09:49,363 INFO  [RegionServer:3;C4C2.site,56262,1306199352333.compactor]
regionserver.HRegion(782): completed compaction on region -ROOT-,,0.70236052 after 0sec
> 2011-05-24 09:09:49,434 DEBUG [main] master.TestRollingRestart(322):
>
>
>
> // waited forever ........
> TRR: Server [C4C2.site,45066,1306199358380] still being processed, waiting
>
> 2011-05-24 09:19:43,104 DEBUG [main] master.TestRollingRestart(322):
>
> TRR: Server [C4C2.site,45066,1306199358380] still being processed, waiting
>
> 2011-05-24 09:19:43,204 DEBUG [main] master.TestRollingRestart(322):
>
> TRR: Server [C4C2.site,45066,1306199358380] still being processed, waiting
>
Mime
View raw message