hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: region server shutdown, why?
Date Thu, 16 Dec 2010 19:44:56 GMT
If loading is the only thing you want to do, please use
HFileOutputFormat; otherwise you probably have GC pause issues (search this
mailing list for advice) or a configuration issue, usually
the xcievers or ulimit settings described here:
http://hbase.apache.org/docs/r0.20.6/api/overview-summary.html#requirements
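
For the bulk-load route, a job driver wired up for HFileOutputFormat looks roughly like this. This is a minimal sketch, not taken from this thread: the table name `ufdr`, the column family `value`, the tab-separated input format, and the mapper are all assumptions, and the exact `configureIncrementalLoad` signature has shifted across HBase versions.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkLoadDriver {

  // Hypothetical mapper: input lines of the form "rowkey<TAB>value",
  // emitted as KeyValues into the (assumed) "value" column family.
  static class LineMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {
    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
        throws IOException, InterruptedException {
      String[] parts = line.toString().split("\t", 2);
      byte[] row = Bytes.toBytes(parts[0]);
      KeyValue kv = new KeyValue(row, Bytes.toBytes("value"),
          Bytes.toBytes("v"), Bytes.toBytes(parts[1]));
      ctx.write(new ImmutableBytesWritable(row), kv);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "bulk-load-ufdr");
    job.setJarByClass(BulkLoadDriver.class);
    job.setMapperClass(LineMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(KeyValue.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // Writes HFiles directly, partitioned and sorted to match the table's
    // current regions, so the load bypasses the region servers entirely.
    HFileOutputFormat.configureIncrementalLoad(job, new HTable(conf, "ufdr"));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

After the job finishes, the generated HFiles still have to be handed to the table with the bulk-load tool shipped with your HBase version; they are not visible until that step.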

J-D
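
On the configuration side, the requirements page above comes down to two knobs on every node; the values here are the commonly recommended ones, not something confirmed in this thread:

```xml
<!-- hdfs-site.xml on each DataNode: raise the transceiver thread cap.
     Note the property really is spelled "xcievers" (a historical typo). -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>
```

together with a file-descriptor limit well above the default 1024 for the user running HDFS and HBase, e.g. `ulimit -n 32768` (or the equivalent entry in `/etc/security/limits.conf`).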

On Thu, Dec 16, 2010 at 1:25 AM, Zhou Shuaifeng
<zhoushuaifeng@huawei.com> wrote:
> Hello,
>
>
>
> I'm doing a performance test on HBase 0.20.6. My cluster contains 7
> regionservers. After loading 4TB of data, one regionserver shut down.
>
> The exception info from the logs is below. Can someone tell me what's the
> matter?
>
>
>
> Regionserver log:
>
>
>
> 2010-12-16 09:42:09,724 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Connection reset by peer
>       at sun.nio.ch.FileDispatcher.write0(Native Method)
>       at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
>       at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:100)
>       at sun.nio.ch.IOUtil.write(IOUtil.java:71)
>       at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
>       at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:55)
>       at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:143)
>       at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
>       at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
>       at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>       at java.io.DataOutputStream.write(DataOutputStream.java:90)
>       at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2904)
>
>
>
> namenode log:
>
>
>
> 2010-12-16 09:39:04,859 INFO  FSNamesystem.audit (FSNamesystem.java:logAuditEvent(188)) - ugi=root,root,sfcb   ip=/167.6.5.13   cmd=create   src=/hbase/ufdr/compaction.dir/811978903/4122372667779754772   perm=root:supergroup:rw-r--r--
>
> 2010-12-16 09:39:04,865 INFO  hdfs.StateChange (FSNamesystem.java:allocateBlock(1627)) - BLOCK* NameSystem.allocateBlock: /hbase/ufdr/compaction.dir/811978903/4122372667779754772. blk_1292432378799_352605
>
> 2010-12-16 09:39:05,084 INFO  FSNamesystem.audit (FSNamesystem.java:logAuditEvent(202)) - ugi=root,root,sfcb   ip=/167.6.5.14   cmd=delete   src=/hbase/.logs/c3s5.site,60020,1292432509342/hlog.dat.1292463618818
>
> 2010-12-16 09:39:05,098 INFO  namenode.NameNode (NameNode.java:errorReport(811)) - Error report from 167.6.5.14:50010: DataNode failed volumes:/hdfsdata/1/current;
>
> 2010-12-16 09:39:05,099 WARN  namenode.NameNode (NameNode.java:errorReport(817)) - Volume failed on 167.6.5.14:50010
>
> 2010-12-16 09:39:05,464 INFO  hdfs.StateChange (FSNamesystem.java:allocateBlock(1627)) - BLOCK* NameSystem.allocateBlock: /hbase/ufdr/565226849/value/9091457825705166400. blk_1292432378800_352605
>
> 2010-12-16 09:39:05,481 INFO  namenode.FSNamesystem (FSNamesystem.java:nextGenerationStampForBlock(5058)) - blk_1292432378630_352428 is already commited, storedBlock == null.
>
> 2010-12-16 09:39:05,481 INFO  ipc.Server (Server.java:run(1000)) - IPC Server handler 1 on 9000, call nextGenerationStamp(blk_1292432378630_352428) from 167.6.5.11:37392: error:
>
> java.io.IOException: blk_1292432378630_352428 is already commited, storedBlock == null.
>       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.nextGenerationStampForBlock(FSNamesystem.java:5059)
>       at org.apache.hadoop.hdfs.server.namenode.NameNode.nextGenerationStamp(NameNode.java:540)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:514)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:990)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:986)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:984)
>
> 2010-12-16 09:39:05,801 INFO  hdfs.StateChange (FSNamesystem.java:addStoredBlock(3326)) - BLOCK* NameSystem.addStoredBlock: blockMap updated: 167.6.5.12:50010 is added to blk_1292432378796_352603 size 42162794
>
> 2010-12-16 09:39:05,802 INFO  hdfs.StateChange (FSNamesystem.java:addStoredBlock(3326)) - BLOCK* NameSystem.addStoredBlock: blockMap updated: 167.6.5.11:50010 is added to blk_1292432378796_352603 size 42162794
>
> 2010-12-16 09:39:05,804 INFO  hdfs.StateChange (FSNamesystem.java:addStoredBlock(3326)) - BLOCK* NameSystem.addStoredBlock: blockMap updated: 167.6.5.17:50010 is added to blk_1292432378796_352603 size 42162794
>
> 2010-12-16 09:39:05,804 INFO  hdfs.StateChange (FSNamesystem.java:completeFileInternal(1576)) - DIR* NameSystem.completeFile: file /hbase/ufdr/326058079/value/1061306473103433726 is closed by DFSClient_-1318892215
>
> 2010-12-16 09:39:05,809 INFO  FSNamesystem.audit (FSNamesystem.java:logAuditEvent(188)) - ugi=root,root,sfcb   ip=/167.6.5.17   cmd=open   src=/hbase/ufdr/326058079/value/1061306473103433726   perm=root:supergroup:rw-r--r--
>
> 2010-12-16 09:39:05,813 INFO  hdfs.StateChange (FSNamesystem.java:addStoredBlock(3326)) - BLOCK* NameSystem.addStoredBlock: blockMap updated: 167.6.5.15:50010 is added to blk_1292432378788_352600 size 5756003
>
> 2010-12-16 09:39:05,815 INFO  hdfs.StateChange (FSNamesystem.java:addStoredBlock(3326)) - BLOCK* NameSystem.addStoredBlock: blockMap updated: 167.6.5.12:50010 is added to blk_1292432378788_352600 size 5756003
>
> 2010-12-16 09:39:05,815 INFO  hdfs.StateChange (FSNamesystem.java:completeFileInternal(1576)) - DIR* NameSystem.completeFile: file /hbase/ufdr/73697944/value/278352731434344857 is closed by DFSClient_-961611556
>
> 2010-12-16 09:39:05,825 INFO  FSNamesystem.audit (FSNamesystem.java:logAuditEvent(188)) - ugi=root,root,sfcb   ip=/167.6.5.12   cmd=open   src=/hbase/ufdr/73697944/value/278352731434344857   perm=root:supergroup:rw-r--r--
>
> 2010-12-16 09:39:05,861 INFO  namenode.FSNamesystem (FSNamesystem.java:nextGenerationStampForBlock(5058)) - blk_1292432378630_352428 is already commited, storedBlock == null.
>
> 2010-12-16 09:39:05,861 INFO  ipc.Server (Server.java:run(1000)) - IPC Server handler 6 on 9000, call nextGenerationStamp(blk_1292432378630_352428) from 167.6.5.11:37392: error:
>
> java.io.IOException: blk_1292432378630_352428 is already commited, storedBlock == null.
>       at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.nextGenerationStampForBlock(FSNamesystem.java:5059)
>       at org.apache.hadoop.hdfs.server.namenode.NameNode.nextGenerationStamp(NameNode.java:540)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:514)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:990)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:986)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:984)
>
> 2010-12-16 09:39:05,883 INFO  FSNamesystem.audit (FSNamesystem.java:logAuditEvent(188)) - ugi=root,root,sfcb   ip=/167.6.5.17   cmd=create   src=/hbase/ufdr/238795490/value/543132150737023964   perm=root:supergroup:rw-r--r--
>
> 2010-12-16 09:39:05,893 INFO  hdfs.StateChange (FSNamesystem.java:allocateBlock(1627)) - BLOCK* NameSystem.allocateBlock: /hbase/ufdr/238795490/value/543132150737023964. blk_1292432378801_352606
>
> 2010-12-16 09:39:05,903 INFO  hdfs.StateChange (FSNamesystem.java:addStoredBlock(3326)) - BLOCK* NameSystem.addStoredBlock: blockMap updated: 167.6.5.11:50010 is added to blk_1292432378789_352600 size 6049189
>
> 2010-12-16 09:39:05,904 INFO  hdfs.StateChange (FSNamesystem.java:addStoredBlock(3326)) - BLOCK* NameSystem.addStoredBlock: blockMap updated: 167.6.5.12:50010 is added to blk_1292432378789_352600 size 6049189
>
>
>
> Data node log:
>
>
>
> 2010-12-16 08:25:32,242 INFO  DataNode.clienttrace (BlockSender.java:sendBlock(487)) - src: /167.6.5.14:50010, dest: /167.6.5.14:36758, bytes: 3138, op: HDFS_READ, cliID: DFSClient_708626937, srvID: DS-814382105-167.6.5.14-50010-1291837985732, blockid: blk_1292432361424_333765
>
> 2010-12-16 08:25:32,426 WARN  datanode.DataNode (DataXceiver.java:readBlock(276)) - DatanodeRegistration(167.6.5.14:50010, storageID=DS-814382105-167.6.5.14-50010-1291837985732, infoPort=50075, ipcPort=50020):Got exception while serving blk_1292432361541_333880 to /167.6.5.14:
>
> java.io.IOException: Block blk_1292432361541_333880 is not valid.
>       at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:962)
>       at org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:949)
>       at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:102)
>       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:240)
>       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:132)
>       at java.lang.Thread.run(Thread.java:662)
>
> 2010-12-16 08:25:32,427 ERROR datanode.DataNode (DataXceiver.java:run(185)) - org.apache.hadoop.hdfs.server.datanode.DataNode (167.6.5.14:50010, storageID=DS-814382105-167.6.5.14-50010-1291837985732, infoPort=50075, ipcPort=50020):DataXceiver
>
> java.io.IOException: Block blk_1292432361541_333880 is not valid.
>       at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:962)
>       at org.apache.hadoop.hdfs.server.datanode.FSDataset.getLength(FSDataset.java:949)
>       at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:102)
>       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:240)
>       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:132)
>       at java.lang.Thread.run(Thread.java:662)
>
> 2010-12-16 08:25:33,419 INFO  datanode.DataNode (DataXceiver.java:writeBlock(307)) - Receiving block blk_1292432365462_338068 src: /167.6.5.14:36774 dest: /167.6.5.14:50010
>
> 2010-12-16 08:25:33,427 INFO  datanode.DataNode (FSDataset.java:checkDirs(754)) - Completed FSVolumeSet.checkDirs. Removed=0volumes. List of current volumes: /hdfsdata/0/current,/hdfsdata/1/current,/hdfsdata/2/current,/hdfsdata/3/current,/hdfsdata/4/current,/hdfsdata/5/current,/hdfsdata/6/current
>
> 2010-12-16 08:25:33,489 INFO  datanode.DataNode (DataXceiver.java:writeBlock(307)) - Receiving block blk_1292432365464_338069 src: /167.6.5.17:51536 dest: /167.6.5.14:50010
>
