From "11 Nov." <nov.eleve...@gmail.com>
Subject Re: Region Server lost response when doing BatchUpdate
Date Tue, 14 Apr 2009 10:03:55 GMT
Hi all,
    The insert operation is still running, but region servers keep going
down now and then. The logs show that they shut down for different
reasons. Here is another failed region server's log:

2009-04-14 16:17:08,718 INFO org.apache.hadoop.hbase.regionserver.HLog:
removing old log file
/hbase/log_192.168.33.213_1239694262099_62020/hlog.dat.1239696959813 whose
highest sequence/edit id is 122635282
2009-04-14 16:17:14,932 INFO org.apache.hadoop.hdfs.DFSClient:
org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.hdfs.server.namenode.NotReplicatedYetException: Not
replicated
yet:/hbase/log_192.168.33.213_1239694262099_62020/hlog.dat.1239697028652
    at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1266)
    at
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
    at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)

    at org.apache.hadoop.ipc.Client.call(Client.java:697)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at $Proxy1.addBlock(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy1.addBlock(Unknown Source)
    at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2823)
    at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2705)
    at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996)
    at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2182)

2009-04-14 16:17:14,932 WARN org.apache.hadoop.hdfs.DFSClient:
NotReplicatedYetException sleeping
/hbase/log_192.168.33.213_1239694262099_62020/hlog.dat.1239697028652 retries
left 4
2009-04-14 16:17:15,499 INFO org.apache.hadoop.hbase.regionserver.HLog:
Closed
hdfs://compute-11-5.local:11004/hbase/log_192.168.33.213_1239694262099_62020/hlog.dat.1239697021646,
entries=100003. New log writer:
/hbase/log_192.168.33.213_1239694262099_62020/hlog.dat.1239697035433

.................................


2009-04-14 17:18:44,259 WARN org.apache.hadoop.hdfs.DFSClient:
NotReplicatedYetException sleeping
/hbase/log_192.168.33.213_1239694262099_62020/hlog.dat.1239700723643 retries
left 4
2009-04-14 17:18:44,663 INFO org.apache.hadoop.hdfs.DFSClient:
org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.hdfs.server.namenode.NotReplicatedYetException: Not
replicated
yet:/hbase/log_192.168.33.213_1239694262099_62020/hlog.dat.1239700723643
    at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1266)
    at
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:351)
    at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)

    at org.apache.hadoop.ipc.Client.call(Client.java:697)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at $Proxy1.addBlock(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy1.addBlock(Unknown Source)
    at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2823)
    at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2705)
    at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:1996)
    at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2182)

2009-04-14 17:18:44,663 WARN org.apache.hadoop.hdfs.DFSClient:
NotReplicatedYetException sleeping


   There are 8 cores on each node, and we configured 4 map tasks to run
simultaneously. Are we running at too high a concurrency rate?
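
    For context, the map tasks write with BatchUpdate() roughly as follows.
This is a simplified sketch rather than our exact job code (the class name
and the "info:record" column are placeholders), but it shows the commit
path that ends up in the batchUpdates() RPC seen in the logs:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;
import org.apache.hadoop.hbase.util.Bytes;

public class CdrLoadSketch {
  public static void main(String[] args) throws IOException {
    HTable table = new HTable(new HBaseConfiguration(), "CDR");
    List<BatchUpdate> batch = new ArrayList<BatchUpdate>();
    for (long i = 0; i < 1000; i++) {
      // Sequential, zero-padded row keys like the ones in the region names above
      BatchUpdate bu = new BatchUpdate(String.format("%012d", i));
      bu.put("info:record", Bytes.toBytes("value-" + i));
      batch.add(bu);
    }
    // One client-side commit; the server side logs this as batchUpdates(...)
    table.commit(batch);
  }
}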


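    And the "more randomized pattern" suggested below, as we understand it,
just means salting the sequential key so writes spread over many regions
instead of always hitting the last one. A hypothetical sketch (the bucket
count and key format are made up for illustration):

public class SaltedKeySketch {
  // Number of salt buckets; e.g. roughly the number of region servers.
  private static final int BUCKETS = 32;

  static String saltedRow(String sequentialKey) {
    // Deterministic salt derived from the key itself, so a reader can
    // recompute the prefix; the mask keeps the hash non-negative.
    int bucket = (sequentialKey.hashCode() & 0x7fffffff) % BUCKETS;
    return String.format("%02d-%s", bucket, sequentialKey);
  }

  public static void main(String[] args) {
    // "000220285104" is one of the CDR row keys from the log above.
    System.out.println(saltedRow("000220285104"));
  }
}

The obvious trade-off is that rows are no longer stored in the original key
order, so plain sequential scans over that order are lost.
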
2009/4/14 11 Nov. <nov.eleventh@gmail.com>

> hi JD,
>     I tried your solution, upgrading HBase to 0.19.1 and applying the
> patch. The inserting MapReduce application had been running for more than
> half an hour when we lost one region server; here is the log on the lost
> region server:
>
> 2009-04-14 16:08:11,483 FATAL
> org.apache.hadoop.hbase.regionserver.MemcacheFlusher: Replay of hlog
> required. Forcing server shutdown
> org.apache.hadoop.hbase.DroppedSnapshotException: region:
> CDR,000220285104,1239696381168
>     at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:897)
>     at
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:790)
>     at
> org.apache.hadoop.hbase.regionserver.MemcacheFlusher.flushRegion(MemcacheFlusher.java:228)
>     at
> org.apache.hadoop.hbase.regionserver.MemcacheFlusher.run(MemcacheFlusher.java:138)
> Caused by: java.lang.ClassCastException: [B cannot be cast to
> org.apache.hadoop.hbase.HStoreKey
>     at
> org.apache.hadoop.hbase.regionserver.HStore.internalFlushCache(HStore.java:679)
>     at
> org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:636)
>     at
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:882)
>     ... 3 more
> 2009-04-14 16:08:11,553 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
> request=0.0, regions=13, stores=13, storefiles=63, storefileIndexSize=6,
> memcacheSize=206, usedHeap=631, maxHeap=4991
> 2009-04-14 16:08:11,553 INFO
> org.apache.hadoop.hbase.regionserver.MemcacheFlusher:
> regionserver/0:0:0:0:0:0:0:0:62020.cacheFlusher exiting
> 2009-04-14 16:08:12,502 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 14 on 62020, call batchUpdates([B@7075ae,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@573df2bb) from
> 192.168.33.211:33093: error: java.io.IOException: Server not running,
> aborting
> java.io.IOException: Server not running, aborting
>     at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2109)
>     at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:1618)
>     at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:912)
> 2009-04-14 16:08:12,502 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 14 on 62020, call batchUpdates([B@240affbc,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@4e1ba220) from
> 192.168.33.212:48018: error: java.io.IOException: Server not running,
> aborting
> java.io.IOException: Server not running, aborting
>     at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2109)
>     at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:1618)
>     at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:912)
> 2009-04-14 16:08:12,502 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 14 on 62020, call batchUpdates([B@78310aef,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@5bc50e8e) from
> 192.168.33.253:48798: error: java.io.IOException: Server not running,
> aborting
> java.io.IOException: Server not running, aborting
>     at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2109)
>     at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:1618)
>     at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:912)
> 2009-04-14 16:08:12,503 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 14 on 62020, call batchUpdates([B@663ebbb3,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@20951936) from
> 192.168.34.2:52907: error: java.io.IOException: Server not running,
> aborting
> java.io.IOException: Server not running, aborting
>     at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2109)
>     at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:1618)
>     at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:912)
> 2009-04-14 16:08:12,503 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 14 on 62020, call batchUpdates([B@1caa38f0,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@6b802343) from
> 192.168.33.238:34167: error: java.io.IOException: Server not running,
> aborting
> java.io.IOException: Server not running, aborting
>     at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2109)
>     at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:1618)
>     at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:912)
> 2009-04-14 16:08:12,503 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 14 on 62020, call batchUpdates([B@298b3ad8,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@73c45036) from
> 192.168.33.236:45877: error: java.io.IOException: Server not running,
> aborting
> java.io.IOException: Server not running, aborting
>     at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2109)
>     at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:1618)
>     at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:912)
> 2009-04-14 16:08:12,503 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 14 on 62020, call batchUpdates([B@5d6e449a,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@725a0a61) from
> 192.168.33.254:35363: error: java.io.IOException: Server not running,
> aborting
> java.io.IOException: Server not running, aborting
>     at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2109)
>     at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:1618)
>     at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:912)
> 2009-04-14 16:08:13,370 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> server on 62020
> 2009-04-14 16:08:13,370 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 5 on 62020: exiting
> 2009-04-14 16:08:13,370 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
> 2009-04-14 16:08:13,370 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 16 on 62020: exiting
> 2009-04-14 16:08:13,370 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 3 on 62020: exiting
> 2009-04-14 16:08:13,370 INFO org.mortbay.util.ThreadedServer: Stopping
> Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=62030]
> 2009-04-14 16:08:13,371 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> IPC Server listener on 62020
> 2009-04-14 16:08:13,371 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1 on 62020: exiting
> 2009-04-14 16:08:13,371 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 2 on 62020: exiting
> 2009-04-14 16:08:13,371 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 4 on 62020: exiting
> 2009-04-14 16:08:13,371 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 6 on 62020: exiting
> 2009-04-14 16:08:13,371 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 10 on 62020: exiting
> 2009-04-14 16:08:13,371 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 8 on 62020: exiting
> 2009-04-14 16:08:13,370 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 0 on 62020: exiting
> 2009-04-14 16:08:13,371 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 11 on 62020: exiting
> 2009-04-14 16:08:13,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 9 on 62020: exiting
> 2009-04-14 16:08:13,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 13 on 62020: exiting
> 2009-04-14 16:08:13,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 14 on 62020: exiting
> 2009-04-14 16:08:13,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 15 on 62020: exiting
> 2009-04-14 16:08:13,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 17 on 62020: exiting
> 2009-04-14 16:08:13,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 19 on 62020: exiting
> 2009-04-14 16:08:13,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 18 on 62020: exiting
> 2009-04-14 16:08:13,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 12 on 62020: exiting
> 2009-04-14 16:08:13,372 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> IPC Server Responder
> 2009-04-14 16:08:13,372 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 7 on 62020: exiting
> 2009-04-14 16:08:13,464 INFO org.mortbay.http.SocketListener: Stopped
> SocketListener on 0.0.0.0:62030
> 2009-04-14 16:08:13,471 INFO org.mortbay.util.Container: Stopped
> HttpContext[/logs,/logs]
> 2009-04-14 16:08:13,471 INFO org.mortbay.util.Container: Stopped
> org.mortbay.jetty.servlet.WebApplicationHandler@460c5e9c
> 2009-04-14 16:08:14,887 INFO
> org.apache.hadoop.hbase.regionserver.LogFlusher:
> regionserver/0:0:0:0:0:0:0:0:62020.logFlusher exiting
> 2009-04-14 16:08:14,890 INFO org.apache.hadoop.hbase.Leases:
> regionserver/0:0:0:0:0:0:0:0:62020.leaseChecker closing leases
> 2009-04-14 16:08:14,890 INFO org.mortbay.util.Container: Stopped
> WebApplicationContext[/static,/static]
> 2009-04-14 16:08:14,890 INFO org.apache.hadoop.hbase.Leases:
> regionserver/0:0:0:0:0:0:0:0:62020.leaseChecker closed leases
> 2009-04-14 16:08:14,890 INFO org.mortbay.util.Container: Stopped
> org.mortbay.jetty.servlet.WebApplicationHandler@62c2ee15
> 2009-04-14 16:08:14,896 INFO org.mortbay.util.Container: Stopped
> WebApplicationContext[/,/]
> 2009-04-14 16:08:14,896 INFO org.mortbay.util.Container: Stopped
> org.mortbay.jetty.Server@3f829e6f
> 2009-04-14 16:08:14,896 INFO
> org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
> 2009-04-14 16:08:14,896 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker:
> regionserver/0:0:0:0:0:0:0:0:62020.majorCompactionChecker exiting
> 2009-04-14 16:08:14,969 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: On abort, closed hlog
> 2009-04-14 16:08:14,969 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed CDR,000145028698,1239695232467
> 2009-04-14 16:08:14,970 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed CDR,000485488629,1239696366886
> 2009-04-14 16:08:14,970 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed CDR,000030226388,1239695919978
> 2009-04-14 16:08:14,971 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed CDR,000045007972,1239696394474
> 2009-04-14 16:08:14,971 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed CDR,000370014326,1239695407460
> 2009-04-14 16:08:17,790 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
> 2009-04-14 16:08:46,566 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> compaction completed on region CDR,000315256623,1239695638429 in 1mins, 3sec
> 2009-04-14 16:08:46,566 INFO
> org.apache.hadoop.hbase.regionserver.CompactSplitThread:
> regionserver/0:0:0:0:0:0:0:0:62020.compactor exiting
> 2009-04-14 16:08:46,567 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed CDR,000315256623,1239695638429
> 2009-04-14 16:08:46,568 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed CDR,000555259592,1239696091451
> 2009-04-14 16:08:46,569 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed CDR,000575345572,1239696111244
> 2009-04-14 16:08:46,570 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed CDR,000515619625,1239696375751
> 2009-04-14 16:08:46,570 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed CDR,000525154897,1239695988209
> 2009-04-14 16:08:46,570 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed CDR,000220285104,1239696381168
> 2009-04-14 16:08:46,571 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed CDR,000045190615,1239696394474
> 2009-04-14 16:08:46,572 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Closed CDR,000555161660,1239696091451
> 2009-04-14 16:08:46,572 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at:
> 192.168.33.215:62020
> 2009-04-14 16:08:46,684 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> regionserver/0:0:0:0:0:0:0:0:62020 exiting
> 2009-04-14 16:08:46,713 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Starting shutdown
> thread.
> 2009-04-14 16:08:46,714 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread complete
>
>
>     I restarted this region server and now it seems to work just
> fine.
>
>
> 2009/4/14 11 Nov. <nov.eleventh@gmail.com>
>
> hi Jean-Daniel,
>>     As you said, we were inserting data using a sequential pattern; with
>> a random pattern there would be no such problem.
>>     I'm trying hbase 0.19.1 and the patch now.
>>     Thanks!
>>
>> 2009/4/13 Jean-Daniel Cryans <jdcryans@apache.org>
>>
>> I see that your region server had 5188 store files in 121 stores; I'm
>>> 99% sure that's the cause of your OOME. Luckily for you, we've been
>>> working on this issue since last week. What you should do:
>>>
>>> - Upgrade to HBase 0.19.1
>>>
>>> - Apply the latest patch in
>>> https://issues.apache.org/jira/browse/HBASE-1058 (the v3)
>>>
>>> Then you should be good. As to what caused this huge number of store
>>> files, I wouldn't be surprised if your data was uploaded sequentially,
>>> which would mean that whatever the number of regions (hence the level
>>> of distribution) in your table, only 1 region gets the load. This
>>> implies that another workaround for your problem would be to insert
>>> with a more randomized pattern.
>>>
>>> Thx for trying either solution,
>>>
>>> J-D
>>>
>>> On Mon, Apr 13, 2009 at 8:28 AM, 11 Nov. <nov.eleventh@gmail.com> wrote:
>>> > hi colleagues,
>>> >    We have recently been inserting data into a 32-node HBase cluster
>>> > using the MapReduce framework, but the operation always fails because
>>> > of regionserver exceptions. We run 4 map tasks simultaneously on the
>>> > same node, and use the BatchUpdate() function to handle the inserts.
>>> >    We have been suffering from this problem since last month; it only
>>> > occurs on relatively large clusters at high concurrent insert rates.
>>> > We are using hadoop-0.19.2 from current svn (the head revision as of
>>> > last week), and hbase 0.19.0.
>>> >
>>> >    Here is the hadoop-site.xml configuration file:
>>> >
>>> > <configuration>
>>> > <property>
>>> >  <name>fs.default.name</name>
>>> >  <value>hdfs://192.168.33.204:11004/</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>mapred.job.tracker</name>
>>> >  <value>192.168.33.204:11005</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>dfs.secondary.http.address</name>
>>> >  <value>0.0.0.0:51100</value>
>>> >  <description>
>>> >    The secondary namenode http server address and port.
>>> >    If the port is 0 then the server will start on a free port.
>>> >  </description>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>dfs.datanode.address</name>
>>> >  <value>0.0.0.0:51110</value>
>>> >  <description>
>>> >    The address where the datanode server will listen to.
>>> >    If the port is 0 then the server will start on a free port.
>>> >  </description>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>dfs.datanode.http.address</name>
>>> >  <value>0.0.0.0:51175</value>
>>> >  <description>
>>> >    The datanode http server address and port.
>>> >    If the port is 0 then the server will start on a free port.
>>> >  </description>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>dfs.datanode.ipc.address</name>
>>> >  <value>0.0.0.0:11010</value>
>>> >  <description>
>>> >    The datanode ipc server address and port.
>>> >    If the port is 0 then the server will start on a free port.
>>> >  </description>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>dfs.datanode.handler.count</name>
>>> >  <value>30</value>
>>> >  <description>The number of server threads for the
>>> datanode.</description>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>dfs.namenode.handler.count</name>
>>> >  <value>30</value>
>>> >  <description>The number of server threads for the
>>> namenode.</description>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>mapred.job.tracker.handler.count</name>
>>> >  <value>30</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>mapred.reduce.parallel.copies</name>
>>> >  <value>30</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>dfs.http.address</name>
>>> >  <value>0.0.0.0:51170</value>
>>> >  <description>
>>> >    The address and the base port where the dfs namenode web ui will
>>> listen
>>> > on.
>>> >    If the port is 0 then the server will start on a free port.
>>> >  </description>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>dfs.datanode.max.xcievers</name>
>>> >  <value>8192</value>
>>> >  <description>
>>> >  </description>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>dfs.datanode.socket.write.timeout</name>
>>> >  <value>0</value>
>>> >  <description>
>>> >  </description>
>>> > </property>
>>> >
>>> >
>>> > <property>
>>> >  <name>dfs.datanode.https.address</name>
>>> >  <value>0.0.0.0:50477</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>dfs.https.address</name>
>>> >  <value>0.0.0.0:50472</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>mapred.job.tracker.http.address</name>
>>> >  <value>0.0.0.0:51130</value>
>>> >  <description>
>>> >    The job tracker http server address and port the server will listen
>>> on.
>>> >    If the port is 0 then the server will start on a free port.
>>> >  </description>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>mapred.task.tracker.http.address</name>
>>> >  <value>0.0.0.0:51160</value>
>>> >  <description>
>>> >    The task tracker http server address and port.
>>> >    If the port is 0 then the server will start on a free port.
>>> >  </description>
>>> > </property>
>>> >
>>> >
>>> > <property>
>>> >  <name>mapred.map.tasks</name>
>>> >  <value>3</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>mapred.reduce.tasks</name>
>>> >  <value>2</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>mapred.tasktracker.map.tasks.maximum</name>
>>> >  <value>4</value>
>>> >  <description>
>>> >        The maximum number of map tasks that will be run simultaneously
>>> by a
>>> > task tracker.
>>> >  </description>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>dfs.name.dir</name>
>>> >
>>> >
>>> <value>/data0/hbase/filesystem/dfs/name,/data1/hbase/filesystem/dfs/name,/data2/hbase/filesystem/dfs/name,/data3/hbase/filesystem/dfs/name</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>dfs.data.dir</name>
>>> >
>>> >
>>> <value>/data0/hbase/filesystem/dfs/data,/data1/hbase/filesystem/dfs/data,/data2/hbase/filesystem/dfs/data,/data3/hbase/filesystem/dfs/data</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>fs.checkpoint.dir</name>
>>> >
>>> >
>>> <value>/data0/hbase/filesystem/dfs/namesecondary,/data1/hbase/filesystem/dfs/namesecondary,/data2/hbase/filesystem/dfs/namesecondary,/data3/hbase/filesystem/dfs/namesecondary</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>mapred.system.dir</name>
>>> >  <value>/data1/hbase/filesystem/mapred/system</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>mapred.local.dir</name>
>>> >
>>> >
>>> <value>/data0/hbase/filesystem/mapred/local,/data1/hbase/filesystem/mapred/local,/data2/hbase/filesystem/mapred/local,/data3/hbase/filesystem/mapred/local</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>dfs.replication</name>
>>> >  <value>3</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>hadoop.tmp.dir</name>
>>> >  <value>/data1/hbase/filesystem/tmp</value>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>mapred.task.timeout</name>
>>> >  <value>3600000</value>
>>> >  <description>The number of milliseconds before a task will be
>>> >  terminated if it neither reads an input, writes an output, nor
>>> >  updates its status string.
>>> >  </description>
>>> > </property>
>>> >
>>> > <property>
>>> >  <name>ipc.client.idlethreshold</name>
>>> >  <value>4000</value>
>>> >  <description>Defines the threshold number of connections after which
>>> >               connections will be inspected for idleness.
>>> >  </description>
>>> > </property>
>>> >
>>> >
>>> > <property>
>>> >  <name>ipc.client.connection.maxidletime</name>
>>> >  <value>120000</value>
>>> >  <description>The maximum time in msec after which a client will bring
>>> down
>>> > the
>>> >               connection to the server.
>>> >  </description>
>>> > </property>
>>> >
>>> > <property>
>>> >  <value>-Xmx256m -XX:+UseConcMarkSweepGC
>>> -XX:+CMSIncrementalMode</value>
>>> > </property>
>>> >
>>> > </configuration>
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >    And here is the hbase-site.xml config file:
>>> >
>>> > <?xml version="1.0"?>
>>> > <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>> >
>>> > <configuration>
>>> >  <property>
>>> >    <name>hbase.master</name>
>>> >    <value>192.168.33.204:62000</value>
>>> >    <description>The host and port that the HBase master runs at.
>>> >    A value of 'local' runs the master and a regionserver in
>>> >    a single process.
>>> >    </description>
>>> >  </property>
>>> >  <property>
>>> >    <name>hbase.rootdir</name>
>>> >    <value>hdfs://192.168.33.204:11004/hbase</value>
>>> >    <description>The directory shared by region servers.
>>> >    Should be fully-qualified to include the filesystem to use.
>>> >    E.g: hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR
>>> >    </description>
>>> >  </property>
>>> >
>>> >  <property>
>>> >    <name>hbase.master.info.port</name>
>>> >    <value>62010</value>
>>> >    <description>The port for the hbase master web UI
>>> >    Set to -1 if you do not want the info server to run.
>>> >    </description>
>>> >  </property>
>>> >  <property>
>>> >    <name>hbase.master.info.bindAddress</name>
>>> >    <value>0.0.0.0</value>
>>> >    <description>The address for the hbase master web UI
>>> >    </description>
>>> >  </property>
>>> >  <property>
>>> >    <name>hbase.regionserver</name>
>>> >    <value>0.0.0.0:62020</value>
>>> >    <description>The host and port a HBase region server runs at.
>>> >    </description>
>>> >  </property>
>>> >
>>> >  <property>
>>> >    <name>hbase.regionserver.info.port</name>
>>> >    <value>62030</value>
>>> >    <description>The port for the hbase regionserver web UI
>>> >    Set to -1 if you do not want the info server to run.
>>> >    </description>
>>> >  </property>
>>> >  <property>
>>> >    <name>hbase.regionserver.info.bindAddress</name>
>>> >    <value>0.0.0.0</value>
>>> >    <description>The address for the hbase regionserver web UI
>>> >    </description>
>>> >  </property>
>>> >
>>> >  <property>
>>> >    <name>hbase.regionserver.handler.count</name>
>>> >    <value>20</value>
>>> >  </property>
>>> >
>>> >  <property>
>>> >    <name>hbase.master.lease.period</name>
>>> >    <value>180000</value>
>>> >  </property>
>>> >
>>> > </configuration>
>>> >
>>> >
>>> >    Here is a slice of the error log from one of the failed
>>> > regionservers, which stopped responding after the OutOfMemoryError:
>>> >
>>> > 2009-04-13 15:20:26,077 FATAL
>>> > org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError,
>>> > aborting.
>>> > java.lang.OutOfMemoryError: Java heap space
>>> > 2009-04-13 15:20:48,062 INFO
>>> > org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
>>> > request=0, regions=121, stores=121, storefiles=5188,
>>> storefileIndexSize=195,
>>> > memcacheSize=214, usedHeap=4991, maxHeap=4991
>>> > 2009-04-13 15:20:48,062 INFO org.apache.hadoop.ipc.HBaseServer:
>>> Stopping
>>> > server on 62020
>>> > 2009-04-13 15:20:48,063 INFO
>>> > org.apache.hadoop.hbase.regionserver.LogFlusher:
>>> > regionserver/0:0:0:0:0:0:0:0:62020.logFlusher exiting
>>> > 2009-04-13 15:20:48,201 INFO
>>> > org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
>>> > 2009-04-13 15:20:48,228 WARN org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > Responder, call batchUpdates([B@74f0bb4e,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@689939dc) from
>>> > 192.168.33.206:47754: output error
>>> > 2009-04-13 15:20:48,229 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 5 on 62020 caught: java.nio.channels.ClosedByInterruptException
>>> >    at
>>> >
>>> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>>> >    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>>> >
>>> > 2009-04-13 15:20:48,229 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 5 on 62020: exiting
>>> > 2009-04-13 15:20:48,297 INFO org.apache.hadoop.ipc.HBaseServer:
>>> Stopping IPC
>>> > Server Responder
>>> > 2009-04-13 15:20:48,552 INFO org.apache.zookeeper.ClientCnxn:
>>> Attempting
>>> > connection to server /192.168.33.204:2181
>>> > 2009-04-13 15:20:48,552 WARN org.apache.zookeeper.ClientCnxn: Exception
>>> > closing session 0x0 to sun.nio.ch.SelectionKeyImpl@480edf31
>>> > java.io.IOException: TIMED OUT
>>> >    at
>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:837)
>>> > 2009-04-13 15:20:48,555 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 9 on 62020, call batchUpdates([B@3509aa7f,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@d98930d) from
>>> 192.168.33.234:44367:
>>> > error: java.io.IOException: Server not running, aborting
>>> > java.io.IOException: Server not running, aborting
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2809)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2304)
>>> >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >    at
>>> >
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> >    at
>>> >
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >    at java.lang.reflect.Method.invoke(Method.java:597)
>>> >    at
>>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
>>> > 2009-04-13 15:20:48,561 WARN org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > Responder, call batchUpdates([B@525a19ce,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@19544d9f) from
>>> > 192.168.33.208:47852: output error
>>> > 2009-04-13 15:20:48,561 WARN org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > Responder, call batchUpdates([B@483206fe,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@4c6932b9) from
>>> > 192.168.33.221:37020: output error
>>> > 2009-04-13 15:20:48,561 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 0 on 62020 caught: java.nio.channels.ClosedByInterruptException
>>> >    at
>>> >
>>> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>>> >    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>>> >
>>> > 2009-04-13 15:20:48,561 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 0 on 62020: exiting
>>> > 2009-04-13 15:20:48,561 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 7 on 62020 caught: java.nio.channels.ClosedByInterruptException
>>> >    at
>>> >
>>> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>>> >    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>>> >
>>> > 2009-04-13 15:20:48,655 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 7 on 62020: exiting
>>> > 2009-04-13 15:20:48,692 WARN org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > Responder, call batchUpdates([B@61af3c0e,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@378fed3c) from
>>> 192.168.34.1:35923:
>>> > output error
>>> > 2009-04-13 15:20:48,877 WARN org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > Responder, call batchUpdates([B@2c4ff8dd,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@365b8be5) from
>>> 192.168.34.3:39443:
>>> > output error
>>> > 2009-04-13 15:20:48,877 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 16 on 62020 caught:
>>> java.nio.channels.ClosedByInterruptException
>>> >    at
>>> >
>>> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>>> >    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>>> >
>>> > 2009-04-13 15:20:48,877 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 16 on 62020: exiting
>>> > 2009-04-13 15:20:48,877 WARN org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > Responder, call batchUpdates([B@343d8344,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@32750027) from
>>> > 192.168.33.236:45479: output error
>>> > 2009-04-13 15:20:49,008 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 17 on 62020 caught:
>>> java.nio.channels.ClosedByInterruptException
>>> >    at
>>> >
>>> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>>> >    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>>> >
>>> > 2009-04-13 15:20:49,008 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 17 on 62020: exiting
>>> > 2009-04-13 15:20:48,654 WARN org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > Responder, call batchUpdates([B@3ff34fed,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@7f047167) from
>>> > 192.168.33.219:40059: output error
>>> > 2009-04-13 15:20:48,654 ERROR
>>> com.cmri.hugetable.zookeeper.ZNodeWatcher:
>>> > processNode /hugetable09/hugetable/acl.lock error!KeeperErrorCode =
>>> > ConnectionLoss
>>> > 2009-04-13 15:20:48,649 WARN org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > Responder, call batchUpdates([B@721d9b81,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@75cc6cae) from
>>> > 192.168.33.254:51617: output error
>>> > 2009-04-13 15:20:48,649 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 12 on 62020, call batchUpdates([B@655edc27,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@36c7b86f) from
>>> > 192.168.33.238:51231: error: java.io.IOException: Server not running,
>>> > aborting
>>> > java.io.IOException: Server not running, aborting
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2809)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2304)
>>> >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >    at
>>> >
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> >    at
>>> >
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >    at java.lang.reflect.Method.invoke(Method.java:597)
>>> >    at
>>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
>>> > 2009-04-13 15:20:48,648 WARN org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > Responder, call batchUpdates([B@3c853cce,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@4f5b176c) from
>>> > 192.168.33.209:43520: output error
>>> > 2009-04-13 15:20:49,225 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 4 on 62020 caught: java.nio.channels.ClosedByInterruptException
>>> >    at
>>> >
>>> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>>> >    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>>> >
>>> > 2009-04-13 15:20:49,226 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 4 on 62020: exiting
>>> > 2009-04-13 15:20:48,648 WARN org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > Responder, call batchUpdates([B@3509aa7f,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@d98930d) from
>>> 192.168.33.234:44367:
>>> > output error
>>> > 2009-04-13 15:20:48,647 INFO org.mortbay.util.ThreadedServer: Stopping
>>> > Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=62030]
>>> > 2009-04-13 15:20:49,266 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 9 on 62020 caught: java.nio.channels.ClosedByInterruptException
>>> >    at
>>> >
>>> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>>> >    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>>> >
>>> > 2009-04-13 15:20:49,266 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 9 on 62020: exiting
>>> > 2009-04-13 15:20:48,646 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 2 on 62020, call batchUpdates([B@2cc91b6,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@44724529) from
>>> > 192.168.33.210:44154: error: java.io.IOException: Server not running,
>>> > aborting
>>> > java.io.IOException: Server not running, aborting
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2809)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2304)
>>> >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >    at
>>> >
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> >    at
>>> >
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >    at java.lang.reflect.Method.invoke(Method.java:597)
>>> >    at
>>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
>>> > 2009-04-13 15:20:48,572 WARN org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > Responder, call batchUpdates([B@e8136e0,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@4539b390) from
>>> > 192.168.33.217:60476: output error
>>> > 2009-04-13 15:20:49,272 WARN org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > Responder, call batchUpdates([B@2cc91b6,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@44724529) from
>>> > 192.168.33.210:44154: output error
>>> > 2009-04-13 15:20:49,272 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 8 on 62020 caught: java.nio.channels.ClosedByInterruptException
>>> >    at
>>> >
>>> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>>> >    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>>> >
>>> > 2009-04-13 15:20:49,272 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 8 on 62020: exiting
>>> > 2009-04-13 15:20:49,263 WARN org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > Responder, call batchUpdates([B@655edc27,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@36c7b86f) from
>>> > 192.168.33.238:51231: output error
>>> > 2009-04-13 15:20:49,225 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 1 on 62020 caught: java.nio.channels.ClosedByInterruptException
>>> >    at
>>> >
>>> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>>> >    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>>> >
>>> > 2009-04-13 15:20:49,068 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 14 on 62020 caught:
>>> java.nio.channels.ClosedByInterruptException
>>> >    at
>>> >
>>> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>>> >    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>>> >
>>> > 2009-04-13 15:20:49,345 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 14 on 62020: exiting
>>> > 2009-04-13 15:20:49,048 ERROR
>>> > org.apache.hadoop.hbase.regionserver.HRegionServer:
>>> > java.lang.OutOfMemoryError: Java heap space
>>> > 2009-04-13 15:20:49,484 FATAL
>>> > org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError,
>>> > aborting.
>>> > java.lang.OutOfMemoryError: Java heap space
>>> >    at
>>> >
>>> java.util.concurrent.ConcurrentHashMap$Values.iterator(ConcurrentHashMap.java:1187)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.HRegionServer.getGlobalMemcacheSize(HRegionServer.java:2863)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.MemcacheFlusher.reclaimMemcacheMemory(MemcacheFlusher.java:260)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2307)
>>> >    at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>>> >    at
>>> >
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >    at java.lang.reflect.Method.invoke(Method.java:597)
>>> >    at
>>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
>>> > 2009-04-13 15:20:49,488 INFO
>>> > org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
>>> > request=0, regions=121, stores=121, storefiles=5188,
>>> storefileIndexSize=195,
>>> > memcacheSize=214, usedHeap=4985, maxHeap=4991
>>> > 2009-04-13 15:20:49,489 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 15 on 62020, call batchUpdates([B@302bb17f,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@492218e) from
>>> 192.168.33.235:35276:
>>> > error: java.io.IOException: java.lang.OutOfMemoryError: Java heap space
>>> > java.io.IOException: java.lang.OutOfMemoryError: Java heap space
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1334)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1324)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2320)
>>> >    at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>>> >    at
>>> >
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >    at java.lang.reflect.Method.invoke(Method.java:597)
>>> >    at
>>> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
>>> > Caused by: java.lang.OutOfMemoryError: Java heap space
>>> >    at
>>> >
>>> java.util.concurrent.ConcurrentHashMap$Values.iterator(ConcurrentHashMap.java:1187)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.HRegionServer.getGlobalMemcacheSize(HRegionServer.java:2863)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.MemcacheFlusher.reclaimMemcacheMemory(MemcacheFlusher.java:260)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2307)
>>> >    ... 5 more
>>> > 2009-04-13 15:20:49,490 WARN org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > Responder, call batchUpdates([B@302bb17f,
>>> > [Lorg.apache.hadoop.hbase.io.BatchUpdate;@492218e) from
>>> 192.168.33.235:35276:
>>> > output error
>>> > 2009-04-13 15:20:49,047 INFO org.apache.hadoop.ipc.HBaseServer:
>>> Stopping IPC
>>> > Server listener on 62020
>>> > 2009-04-13 15:20:49,493 INFO org.apache.hadoop.ipc.HBaseServer: IPC
>>> Server
>>> > handler 15 on 62020 caught: java.nio.channels.ClosedChannelException
>>> >    at
>>> >
>>> sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
>>> >    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>>> >    at
>>> >
>>> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>>> >
>>> >    Any suggestion is welcome! Thanks a lot!
>>> >
>>>
>>
>>
>
