hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Matt Davies <matt.dav...@tynt.com>
Subject Master died after failed assignment of regionserver
Date Mon, 27 Jun 2011 16:54:28 GMT
Hey all,

Interesting problem I'm running into.  It appears that our master has died
after a failed region assignment.  Any ideas as to why?


We are running version 0.90.1 in this cluster.
H3M1 = master
H3S* = regionservers

out on H3S3 the regionserver process is running (but didn't respond per the
log)

2011-06-27 10:18:22,517 INFO
org.apache.hadoop.hbase.master.handler.DisableTableHandler: Attemping to
disable table ContentGuidance
2011-06-27 10:18:22,522 INFO
org.apache.hadoop.hbase.master.handler.DisableTableHandler: Offlining 1
regions.
2011-06-27 10:18:23,527 INFO
org.apache.hadoop.hbase.master.handler.DisableTableHandler: Disabled table
is done=true
2011-06-27 10:18:24,536 INFO
org.apache.hadoop.hbase.master.handler.TableEventHandler: Handling table
operation C_M_DELETE_TABLE on table ContentGuidance
2011-06-27 10:18:24,584 INFO org.apache.hadoop.hbase.catalog.MetaEditor:
Deleted region
ContentGuidance,,1298676132327.9d63b6173c7307c27aa9728f738f0ef3. from META
2011-06-27 10:19:17,296 WARN org.apache.hadoop.hbase.zookeeper.ZKTable:
Moving table ContentGuidance state to enabled but was already enabled
2011-06-27 10:19:17,313 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
HLog configuration: blocksize=64 MB, rollsize=60.8 MB, enabled=true,
flushlogentries=1, optionallogflushinternal=1000ms
2011-06-27 10:19:17,349 INFO
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter: Using syncFs
-- HDFS-200
2011-06-27 10:19:17,350 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
New hlog
/hbase/ContentGuidance/f38bbc9bc672b5ea415b6dcc7e1da43a/.logs/hlog.1309191557313
2011-06-27 10:19:17,350 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
Using getNumCurrentReplicas--HDFS-826
2011-06-27 10:19:17,433 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Onlined ContentGuidance,,1309191557292.f38bbc9bc672b5ea415b6dcc7e1da43a.;
next sequenceid=1
2011-06-27 10:19:17,438 INFO org.apache.hadoop.hbase.catalog.MetaEditor:
Added region
ContentGuidance,,1309191557292.f38bbc9bc672b5ea415b6dcc7e1da43a. to META
2011-06-27 10:19:17,439 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Closed ContentGuidance,,1309191557292.f38bbc9bc672b5ea415b6dcc7e1da43a.
2011-06-27 10:19:17,439 INFO org.apache.hadoop.hbase.regionserver.wal.HLog:
IPC Server handler 87 on 60000.logSyncer exiting
2011-06-27 10:19:17,470 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning 1 region(s)
round-robin across 4 server(s)
2011-06-27 10:19:17,472 INFO
org.apache.hadoop.hbase.master.AssignmentManager: Bulk assigning done
2011-06-27 10:19:17,480 INFO
org.apache.hadoop.hbase.master.AssignmentManager: H3S4,60020,1308946657703
unassigned znodes=1 of total=1
2011-06-27 10:19:21,530 INFO
org.apache.hadoop.hbase.master.handler.DisableTableHandler: Attemping to
disable table ContentGuidance
2011-06-27 10:19:21,536 INFO
org.apache.hadoop.hbase.master.handler.DisableTableHandler: Offlining 1
regions.
2011-06-27 10:19:22,543 INFO
org.apache.hadoop.hbase.master.handler.DisableTableHandler: Disabled table
is done=true
2011-06-27 10:19:23,544 INFO
org.apache.hadoop.hbase.master.handler.TableEventHandler: Handling table
operation C_M_DELETE_TABLE on table ContentGuidance
2011-06-27 10:19:23,551 INFO org.apache.hadoop.hbase.catalog.MetaEditor:
Deleted region
ContentGuidance,,1309191557292.f38bbc9bc672b5ea415b6dcc7e1da43a. from META
2011-06-27 10:20:17,479 FATAL org.apache.hadoop.hbase.master.HMaster: Failed
assignment of regions to serverName=H3S3,60020,1308946657608,
load=(requests=0, regions=7, usedHeap=976, maxHeap=1011)
java.net.SocketTimeoutException: Call to H3S3/10..x.x.x3:60020 failed on
socket timeout exception: java.net.SocketTimeoutException: 60000 millis
timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/10..x.x.x:34904
remote=H3S3/10.x.x.x:60020]
at
org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:802)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:775)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
at $Proxy7.openRegions(Unknown Source)
at
org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:566)
at
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:776)
at
org.apache.hadoop.hbase.master.AssignmentManager$SingleServerBulkAssigner.run(AssignmentManager.java:1310)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.net.SocketTimeoutException: 60000 millis timeout while
waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/10..x.x.x:34904
remote=H3S3/10..x.x.x:60020]
at
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
at java.io.FilterInputStream.read(FilterInputStream.java:116)
at
org.apache.hadoop.hbase.ipc.HBaseClient$Connection$PingInputStream.read(HBaseClient.java:299)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
at java.io.DataInputStream.readInt(DataInputStream.java:370)
at
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:539)
at
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:477)
2011-06-27 10:20:17,482 INFO org.apache.hadoop.hbase.master.HMaster:
Aborting
2011-06-27 10:20:18,390 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
server on 60000
2011-06-27 10:20:18,390 INFO org.apache.hadoop.hbase.master.CatalogJanitor:
H3M1:60000-CatalogJanitor exiting
2011-06-27 10:20:18,390 INFO org.apache.hadoop.hbase.master.HMaster$1:
H3M1:60000-BalancerChore exiting
2011-06-27 10:20:18,392 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC
Server listener on 60000

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message