hbase-user mailing list archives

From "Jinsong Hu" <jinsong...@hotmail.com>
Subject regionserver crash under heavy load
Date Tue, 13 Jul 2010 21:49:01 GMT
Hi, Todd:
  I downloaded hadoop-0.20.2+320 and hbase-0.89.20100621+17 from CDH3 and 
inserted data at full load; after a while the HBase regionserver crashed. 
I checked the system with "iostat -x 5" and noticed the disks were quite busy. 
Then I modified my client code to reduce the insertion rate by a factor of 6, 
and the test runs fine. Is there any way the regionserver could be modified so 
that it at least doesn't crash under heavy load? I tried the Apache HBase 
0.20.5 distribution as well and the same problem happens. I am thinking that 
when the regionserver is too busy, it should throttle the incoming data rate 
to protect itself. Could this be done?
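
   Just to illustrate what I mean by throttling, the pacing I did on the 
client side looks roughly like the sketch below (the table name, column 
family, and the 1000 puts/sec budget are made-up placeholders, not my actual 
loader code):

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Rough sketch of client-side pacing against the 0.20-era client API: cap the
// number of puts per second so the regionserver and the HDFS write pipeline
// are not driven flat out. All names and numbers here are placeholders.
public class ThrottledLoader {
  public static void main(String[] args) throws IOException, InterruptedException {
    HBaseConfiguration conf = new HBaseConfiguration();
    HTable table = new HTable(conf, "Spam_MsgEventTable");

    final int putsPerSecond = 1000;                       // tune to what the cluster sustains
    final long nanosPerPut = 1000000000L / putsPerSecond; // minimum spacing between puts
    long next = System.nanoTime();

    for (long i = 0; i < 1000000; i++) {
      Put put = new Put(Bytes.toBytes("row-" + i));
      put.add(Bytes.toBytes("event"), Bytes.toBytes("payload"), Bytes.toBytes("value-" + i));
      table.put(put);

      // Sleep just long enough to stay under the per-second budget.
      next += nanosPerPut;
      long sleepNanos = next - System.nanoTime();
      if (sleepNanos > 0) {
        Thread.sleep(sleepNanos / 1000000L, (int) (sleepNanos % 1000000L));
      }
    }
    table.flushCommits();
  }
}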
   Do you also know when the official CDH3 release will come out? The one I 
downloaded is a beta version. The regionserver log from the crash is pasted 
below.

Jimmy






2010-07-13 02:24:34,389 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed Spam_MsgEventTable,56-2010-05-19 10:09:02\x099a420f4f31748828fd24aeea1d06b294,1278973678315.01dd22f517dabf53ddd135709b68ba6c.
2010-07-13 02:24:34,389 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at: m0002029.ppops.net,60020,1278969481450
2010-07-13 02:24:34,389 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Closed connection with ZooKeeper; /hbase/root-region-server
2010-07-13 02:24:34,389 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver60020 exiting
2010-07-13 02:24:34,608 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook starting; hbase.shutdown.hook=true; fsShutdownHook=Thread[Thread-10,5,main]
2010-07-13 02:24:34,608 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Starting fs shutdown hook thread.
2010-07-13 02:24:34,608 ERROR org.apache.hadoop.hdfs.DFSClient: Exception closing file /hbase/.logs/m0002029.ppops.net,60020,1278969481450/10.110.24.79%3A60020.1278987220794 : java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: Error Recovery for block blk_-1605696159279298313_2395924 failed  because recovery from primary datanode 10.110.24.80:50010 failed 6 times.  Pipeline was 10.110.24.80:50010. Aborting...
java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: IOException flush:java.io.IOException: Error Recovery for block blk_-1605696159279298313_2395924 failed  because recovery from primary datanode 10.110.24.80:50010 failed 6 times.  Pipeline was 10.110.24.80:50010. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3214)
        at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
        at org.apache.hadoop.io.SequenceFile$Writer.syncFs(SequenceFile.java:944)
        at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.sync(SequenceFileLogWriter.java:124)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.hflush(HLog.java:826)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1004)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.append(HLog.java:817)
        at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:1531)
        at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1447)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1703)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.multiPut(HRegionServer.java:2361)
        at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:576)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:919)
2010-07-13 02:24:34,610 ERROR org.apache.hadoop.hdfs.DFSClient: Exception closing file /hbase/Spam_MsgEventTable/079c7de876422e57e5f09fef5d997e06/.tmp/6773658134549268273 : java.io.IOException: All datanodes 10.110.24.80:50010 are bad. Aborting...
java.io.IOException: All datanodes 10.110.24.80:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2603)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2139)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2306)
2010-07-13 02:24:34,729 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Shutdown hook finished.

