hbase-user mailing list archives

From Tao Xiao <xiaotao.cs....@gmail.com>
Subject Re: Some block can not be found while inserting data into HBase
Date Wed, 28 May 2014 03:28:42 GMT
fsck on /apps/hbase says that it is healthy
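When triaging a failure like the one quoted below, it helps to first collect the block IDs the client and NameNode are complaining about, then feed the affected paths to `hdfs fsck`. A small illustrative script for the first step (the regex and sample line are assumptions based on the log format quoted in this thread, not part of HBase or HDFS):

```python
import re

# HDFS client/NameNode logs name blocks as
# BP-<poolId>-<ip>-<ts>:blk_<blockId>_<genStamp>; capture the blk_ part.
BLOCK_RE = re.compile(r"blk_\d+_\d+")

def failing_blocks(log_text):
    """Return the unique block IDs mentioned in a log excerpt, in order."""
    seen = []
    for match in BLOCK_RE.findall(log_text):
        if match not in seen:
            seen.append(match)
    return seen

sample = (
    "java.io.IOException: Bad response ERROR for block "
    "BP-898918553-10.134.101.112-1393904898674:blk_1078908600_5176868 "
    "from datanode 10.134.101.110:50010"
)
print(failing_blocks(sample))  # ['blk_1078908600_5176868']
```

The IDs can then be cross-checked against `hdfs fsck <path> -files -blocks -locations` output for the WAL directory in question.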


2014-05-28 10:58 GMT+08:00 Bharath Vissapragada <bharathv@cloudera.com>:

> Run an fsck on /hbase to check if there are any inconsistencies.
>
>
> On Wed, May 28, 2014 at 6:23 AM, Tao Xiao <xiaotao.cs.nju@gmail.com>
> wrote:
>
> > I'm using HDP 2.0.6
> >
> >
> > 2014-05-28 0:03 GMT+08:00 Ted Yu <yuzhihong@gmail.com>:
> >
> > > What hbase / hadoop release are you using ?
> > >
> > > Cheers
> > >
> > >
> > > On Tue, May 27, 2014 at 4:25 AM, Tao Xiao <xiaotao.cs.nju@gmail.com>
> > > wrote:
> > >
> > > > I put massive records into HBase and found that one of the region
> > servers
> > > > crashed. I checked the RS log and NameNode log and found them
> > complaining
> > > > that some block does not exist.
> > > >
> > > > For example:
> > > >
> > > > *In RS's log:*
> > > > java.io.IOException: Bad response ERROR for block
> > > > BP-898918553-10.134.101.112-1393904898674:blk_1078908600_5176868 from
> > > > datanode 10.134.101.110:50010
> > > >          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:732)
> > > >  2014-05-27 16:18:06,184 WARN  [ResponseProcessor for block
> > > > BP-898918553-10.134.101.112-1393904898674:*blk_1078908599_5176867*]
> > > > hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block
> > > > BP-898918553-10.134.101.112-1393904898674:*blk_1078908599_5176867*
> > > >  java.io.EOFException: Premature EOF: no length prefix available
> > > >          at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1492)
> > > >          at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:116)
> > > >          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:721)
> > > >  2014-05-27 16:18:06,184 WARN  [DataStreamer for file
> > > > /apps/hbase/data/WALs/
> > > > b08.jsepc.com,60020,1400569539507/b08.jsepc.com
> > > > %2C60020%2C1400569539507.1401178572707
> > > > block BP-898918553
> > > > -10.134.101.112-1393904898674:blk_1078908601_5176869] hdfs.DFSClient:
> > > > Error Recovery for block
> > > > BP-898918553-10.134.101.112-1393904898674:blk_1078908601_5176869 in
> > > > pipeline 10.134.101.118:50010, 10.134.101.102:50010,
> > > > 10.134.101.104:50010: bad datanode 10.134.101.104:50010
> > > >
> > > > 2014-05-27 16:18:06,184 WARN  [DataStreamer for file
> > > > /apps/hbase/data/WALs/
> > > > b08.jsepc.com,60020,1400569539507/b08.jsepc.com
> > > > %2C60020%2C1400569539507.1401178568958
> > > > block BP-898918553
> > > > -10.134.101.112-1393904898674:blk_1078908602_5176870] hdfs.DFSClient:
> > > > DataStreamer Exception
> > > >  java.io.IOException: Broken pipe
> > > >          at sun.nio.ch.FileDispatcher.write0(Native Method)
> > > >          at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
> > > >          at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:69)
> > > >          at sun.nio.ch.IOUtil.write(IOUtil.java:40)
> > > >          at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
> > > >          at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
> > > >          at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
> > > >          at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
> > > >          at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
> > > >          at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
> > > >          at java.io.DataOutputStream.write(DataOutputStream.java:90)
> > > >          at org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:278)
> > > >          at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:568)
> > > >  2014-05-27 16:18:06,188 WARN  [DataStreamer for file
> > > > /apps/hbase/data/WALs/
> > > > b08.jsepc.com,60020,1400569539507/b08.jsepc.com
> > > > %2C60020%2C1400569539507.1401178568958
> > > > block BP-898918553
> > > > -10.134.101.112-1393904898674:blk_1078908602_5176870] hdfs.DFSClient:
> > > > Error Recovery for block
> > > > BP-898918553-10.134.101.112-1393904898674:blk_1078908602_5176870 in
> > > > pipeline 10.134.101.118:50010, 10.134.101.108:50010,
> > > > 10.134.101.105:50010: bad datanode 10.134.101.108:50010
> > > >  2014-05-27 16:18:06,184 INFO  [284848237@qtp-396793761-1 - Acceptor0
> > > > SelectChannelConnector@0.0.0.0:60030] mortbay.log:
> > > > org.mortbay.io.nio.SelectorManager$SelectSet@754c6090 JVM BUG(s) -
> > > > recreating selector 17 times, canceled keys 295 times
> > > >  2014-05-27 16:18:06,209 FATAL [regionserver60020]
> > > > regionserver.HRegionServer: ABORTING region server
> > > > b08.jsepc.com,60020,1400569539507:
> > > > org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected;
> > > > currently processing b08.jsepc.com,60020,1400569539507 as dead server
> > > >          at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:341)
> > > >          at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:254)
> > > >          at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:1342)
> > > >          at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:5087)
> > > >          at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
> > > >          at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1879)
> > > >
> > > >  org.apache.hadoop.hbase.YouAreDeadException:
> > > > org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected;
> > > > currently processing b08.jsepc.com,60020,1400569539507 as dead server
> > > >
> > > >
> > > > *In namenode's log:*
> > > > 2014-05-27 16:18:04,593 INFO  BlockStateChange
> > > > (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock:
> > > > blockMap updated: 10.134.101.114:50010 is added to
> > > > blk_1078908684_5176954{blockUCState=UNDER_CONSTRUCTION,
> > > > primaryNodeIndex=-1,
> > > > replicas=[ReplicaUnderConstruction[10.134.101.107:50010|RBW],
> > > > ReplicaUnderConstruction[10.134.101.114:50010|RBW],
> > > > ReplicaUnderConstruction[10.134.101.119:50010|RBW]]} size 0
> > > > 2014-05-27 16:18:04,593 INFO  BlockStateChange
> > > > (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock:
> > > > blockMap updated: 10.134.101.107:50010 is added to
> > > > blk_1078908684_5176954{blockUCState=UNDER_CONSTRUCTION,
> > > > primaryNodeIndex=-1,
> > > > replicas=[ReplicaUnderConstruction[10.134.101.107:50010|RBW],
> > > > ReplicaUnderConstruction[10.134.101.114:50010|RBW],
> > > > ReplicaUnderConstruction[10.134.101.119:50010|RBW]]} size 0
> > > > 2014-05-27 16:18:04,596 INFO  hdfs.StateChange
> > > > (FSNamesystem.java:completeFile(2814)) - DIR* completeFile:
> > > > /apps/hbase/data/data/default/Test-QSH-ARCHIVES/4baa021e822ca4843d64af2e3641deab/.tmp/2d251bb78829442ea22a8031e58721a0
> > > > is closed by DFSClient_hb_rs_a07.jsepc.com,60020,1400569539807_-1204612826_29
> > > > 2014-05-27 16:18:04,876 INFO  BlockStateChange
> > > > (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock:
> > > > blockMap updated: 10.134.101.108:50010 is added to
> > > > blk_1078908671_5176941{blockUCState=UNDER_CONSTRUCTION,
> > > > primaryNodeIndex=-1,
> > > > replicas=[ReplicaUnderConstruction[10.134.101.120:50010|RBW],
> > > > ReplicaUnderConstruction[10.134.101.105:50010|RBW],
> > > > ReplicaUnderConstruction[10.134.101.108:50010|RBW]]} size 0
> > > > 2014-05-27 16:18:04,877 INFO  BlockStateChange
> > > > (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock:
> > > > blockMap updated: 10.134.101.105:50010 is added to
> > > > blk_1078908671_5176941{blockUCState=UNDER_CONSTRUCTION,
> > > > primaryNodeIndex=-1,
> > > > replicas=[ReplicaUnderConstruction[10.134.101.120:50010|RBW],
> > > > ReplicaUnderConstruction[10.134.101.105:50010|RBW],
> > > > ReplicaUnderConstruction[10.134.101.108:50010|RBW]]} size 0
> > > > 2014-05-27 16:18:04,878 INFO  BlockStateChange
> > > > (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock:
> > > > blockMap updated: 10.134.101.120:50010 is added to
> > > > blk_1078908671_5176941{blockUCState=UNDER_CONSTRUCTION,
> > > > primaryNodeIndex=-1,
> > > > replicas=[ReplicaUnderConstruction[10.134.101.120:50010|RBW],
> > > > ReplicaUnderConstruction[10.134.101.105:50010|RBW],
> > > > ReplicaUnderConstruction[10.134.101.108:50010|RBW]]} size 0
> > > > 2014-05-27 16:18:04,880 INFO  hdfs.StateChange
> > > > (FSNamesystem.java:completeFile(2814)) - DIR* completeFile:
> > > > /apps/hbase/data/data/default/Test-QSH-ARCHIVES/872cb2b3a7c3b0ef9c3982abea565329/.tmp/e2b00b2387124e79a87858afc21024fc
> > > > is closed by DFSClient_hb_rs_b10.jsepc.com,60020,1400569547596_-57117285_29
> > > > 2014-05-27 16:18:06,193 ERROR security.UserGroupInformation
> > > > (UserGroupInformation.java:doAs(1494)) - PriviledgedActionException
> > > > as:hadoop (auth:SIMPLE) cause:java.io.IOException:
> > > > BP-898918553-10.134.101.112-1393904898674:*blk_1078908599_5176867* does
> > > > not exist or is not under Constructionnull
> > > > 2014-05-27 16:18:06,193 INFO  ipc.Server (Server.java:run(2075)) - IPC
> > > > Server handler 568 on 8020, call
> > > > org.apache.hadoop.hdfs.protocol.ClientProtocol.updateBlockForPipeline
> > > > from 10.134.101.118:47094 Call#309388 Retry#0: error: java.io.IOException:
> > > > BP-898918553-10.134.101.112-1393904898674:*blk_1078908599_5176867* does
> > > > not exist or is not under Constructionnull
> > > > java.io.IOException: BP-898918553-10.134.101.112-1393904898674:
> > > > *blk_1078908599_5176867* does not exist or is not under Constructionnull
> > > >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkUCBlock(FSNamesystem.java:5526)
> > > >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updateBlockForPipeline(FSNamesystem.java:5591)
> > > >         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updateBlockForPipeline(NameNodeRpcServer.java:628)
> > > >         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updateBlockForPipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:803)
> > > >         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59644)
> > > >         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> > > >         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> > > >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
> > > >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> > > >         at java.security.AccessController.doPrivileged(Native Method)
> > > >         at javax.security.auth.Subject.doAs(Subject.java:396)
> > > >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> > > >         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
> > > > 2014-05-27 16:18:06,202 INFO  hdfs.StateChange
> > > > (FSNamesystem.java:saveAllocatedBlock(2873)) - BLOCK* allocateBlock:
> > > > /apps/hbase/data/WALs/b05.jsepc.com,60020,1400569521345/b05.jsepc.com
> > > > %2C60020%2C1400569521345.1401178672149.
> > > > BP-898918553-10.134.101.112-1393904898674
> > > > blk_1078908686_5176956{blockUCState=UNDER_CONSTRUCTION,
> > > > primaryNodeIndex=-1,
> > > > replicas=[ReplicaUnderConstruction[10.134.101.115:50010|RBW],
> > > > ReplicaUnderConstruction[10.134.101.109:50010|RBW],
> > > > ReplicaUnderConstruction[10.134.101.107:50010|RBW]]}
> > > > 2014-05-27 16:18:06,208 ERROR security.UserGroupInformation
> > > > (UserGroupInformation.java:doAs(1494)) - PriviledgedActionException
> > > > as:hadoop (auth:SIMPLE) cause:java.io.IOException:
> > > > BP-898918553-10.134.101.112-1393904898674:blk_1078908600_5176868 does
> > > > not exist or is not under Constructionnull
> > > > 2014-05-27 16:18:06,208 INFO  ipc.Server (Server.java:run(2075)) - IPC
> > > > Server handler 586 on 8020, call
> > > > org.apache.hadoop.hdfs.protocol.ClientProtocol.updateBlockForPipeline
> > > > from 10.134.101.118:47094 Call#309390 Retry#0: error: java.io.IOException:
> > > > BP-898918553-10.134.101.112-1393904898674:blk_1078908600_5176868 does
> > > > not exist or is not under Constructionnull
> > > > java.io.IOException:
> > > > BP-898918553-10.134.101.112-1393904898674:blk_1078908600_5176868 does
> > > > not exist or is not under Constructionnull
> > > >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkUCBlock(FSNamesystem.java:5526)
> > > >         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updateBlockForPipeline(FSNamesystem.java:5591)
> > > >         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updateBlockForPipeline(NameNodeRpcServer.java:628)
> > > >         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updateBlockForPipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:803)
> > > >         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59644)
> > > >         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> > > >         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> > > >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
> > > >         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> > > >         at java.security.AccessController.doPrivileged(Native Method)
> > > >         at javax.security.auth.Subject.doAs(Subject.java:396)
> > > >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> > > >         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
> > > > 2014-05-27 16:18:06,209 ERROR security.UserGroupInformation
> > > > (UserGroupInformation.java:doAs(1494)) - PriviledgedActionException
> > > > as:hadoop (auth:SIMPLE)
> > > > cause:org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No
> > > > lease on /apps/hbase/data/WALs/b08.jsepc.com,60020,1400569539507/
> > > > b08.jsepc.com%2C60020%2C1400569539507.1401178568958: File does not
> > > > exist. Holder DFSClient_hb_rs_b08.jsepc.com,60020,1400569539507_-1768346484_29
> > > > does not have any open files.
> > > > 2014-05-27 16:18:06,210 INFO  ipc.Server (Server.java:run(2073)) - IPC
> > > > Server handler 415 on 8020, call
> > > > org.apache.hadoop.hdfs.protocol.ClientProtocol.getAdditionalDatanode
> > > > from 10.134.101.118:47094 Call#309391 Retry#0: error:
> > > > org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
> > > > /apps/hbase/data/WALs/b08.jsepc.com,60020,1400569539507/b08.jsepc.com
> > > > %2C60020%2C1400569539507.1401178568958:
> > > > File does not exist. Holder
> > > > DFSClient_hb_rs_b08.jsepc.com,60020,1400569539507_-1768346484_29
> > > > does not have any open files.
> > > > 2014-05-27 16:18:06,210 ERROR security.UserGroupInformation
> > > > (UserGroupInformation.java:doAs(1494)) - PriviledgedActionException
> > > > as:hadoop (auth:SIMPLE)
> > > > cause:org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No
> > > > lease on /apps/hbase/data/WALs/b08.jsepc.com,60020,1400569539507/
> > > > b08.jsepc.com%2C60020%2C1400569539507.1401178572707: File does not
> > > > exist. Holder DFSClient_hb_rs_b08.jsepc.com,60020,1400569539507_-1768346484_29
> > > > does not have any open files.
> > > > 2014-05-27 16:18:06,210 INFO  ipc.Server (Server.java:run(2073)) - IPC
> > > > Server handler 487 on 8020, call
> > > > org.apache.hadoop.hdfs.protocol.ClientProtocol.getAdditionalDatanode
> > > > from 10.134.101.118:47094 Call#309389 Retry#0: error:
> > > > org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on
> > > > /apps/hbase/data/WALs/b08.jsepc.com,60020,1400569539507/b08.jsepc.com
> > > > %2C60020%2C1400569539507.1401178572707:
> > > > File does not exist. Holder
> > > > DFSClient_hb_rs_b08.jsepc.com,60020,1400569539507_-1768346484_29
> > > > does not have any open files.
> > > > 2014-05-27 16:18:06,210 INFO  hdfs.StateChange
> > > > (FSNamesystem.java:fsync(3471)) - BLOCK* fsync: /apps/hbase/data/WALs/
> > > > b05.jsepc.com,60020,1400569521345/b05.jsepc.com
> > > > %2C60020%2C1400569521345.1401178672149
> > > > for DFSClient_hb_rs_b05.jsepc.com,60020,1400569521345_32850186_29
> > > >
> > > >
> > > > Was it the fact that an HDFS block did not exist that caused the RS to
> > > > crash? Why would a block go missing?
> > > >
> > >
> >
>
>
>
> --
> Bharath Vissapragada
> <http://www.cloudera.com>
>
