From: Tao Xiao <xiaotao.cs.nju@gmail.com>
To: user@hbase.apache.org
Date: Wed, 28 May 2014 08:53:42 +0800
Subject: Re: Some block can not be found while inserting data into HBase

I'm using HDP 2.0.6. I've also added a few notes and sketches from my own digging below the quoted logs.

2014-05-28 0:03 GMT+08:00 Ted Yu:

> What hbase / hadoop release are you using ?
>
> Cheers
>
>
> On Tue, May 27, 2014 at 4:25 AM, Tao Xiao <xiaotao.cs.nju@gmail.com>
> wrote:
>
> > I put massive records into HBase and found that one of the region servers
> > crashed. I checked the RS log and the NameNode log and found them
> > complaining that some block does not exist.
> >
> > For example:
> >
> > *In RS's log:*
> > java.io.IOException: Bad response ERROR for block BP-898918553-10.134.101.112-1393904898674:blk_1078908600_5176868 from datanode 10.134.101.110:50010
> >     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:732)
> > 2014-05-27 16:18:06,184 WARN [ResponseProcessor for block BP-898918553-10.134.101.112-1393904898674:*blk_1078908599_5176867*] hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-898918553-10.134.101.112-1393904898674:*blk_1078908599_5176867*
> > java.io.EOFException: Premature EOF: no length prefix available
> >     at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1492)
> >     at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:116)
> >     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:721)
> > 2014-05-27 16:18:06,184 WARN [DataStreamer for file /apps/hbase/data/WALs/b08.jsepc.com,60020,1400569539507/b08.jsepc.com%2C60020%2C1400569539507.1401178572707 block BP-898918553-10.134.101.112-1393904898674:blk_1078908601_5176869] hdfs.DFSClient: Error Recovery for block BP-898918553-10.134.101.112-1393904898674:blk_1078908601_5176869 in pipeline 10.134.101.118:50010, 10.134.101.102:50010, 10.134.101.104:50010: bad datanode 10.134.101.104:50010
> >
> > 2014-05-27 16:18:06,184 WARN [DataStreamer for file /apps/hbase/data/WALs/b08.jsepc.com,60020,1400569539507/b08.jsepc.com%2C60020%2C1400569539507.1401178568958 block BP-898918553-10.134.101.112-1393904898674:blk_1078908602_5176870] hdfs.DFSClient: DataStreamer Exception
> > java.io.IOException: Broken pipe
> >     at sun.nio.ch.FileDispatcher.write0(Native Method)
> >     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
> >     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:69)
> >     at sun.nio.ch.IOUtil.write(IOUtil.java:40)
> >     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
> >     at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
> >     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
> >     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
> >     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
> >     at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
> >     at java.io.DataOutputStream.write(DataOutputStream.java:90)
> >     at org.apache.hadoop.hdfs.DFSOutputStream$Packet.writeTo(DFSOutputStream.java:278)
> >     at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:568)
> > 2014-05-27 16:18:06,188 WARN [DataStreamer for file /apps/hbase/data/WALs/b08.jsepc.com,60020,1400569539507/b08.jsepc.com%2C60020%2C1400569539507.1401178568958 block BP-898918553-10.134.101.112-1393904898674:blk_1078908602_5176870] hdfs.DFSClient: Error Recovery for block BP-898918553-10.134.101.112-1393904898674:blk_1078908602_5176870 in pipeline 10.134.101.118:50010, 10.134.101.108:50010, 10.134.101.105:50010: bad datanode 10.134.101.108:50010
> > 2014-05-27 16:18:06,184 INFO [284848237@qtp-396793761-1 - Acceptor0 SelectChannelConnector@0.0.0.0:60030] mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@754c6090 JVM BUG(s) - recreating selector 17 times, canceled keys 295 times
> > 2014-05-27 16:18:06,209 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server b08.jsepc.com,60020,1400569539507: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing b08.jsepc.com,60020,1400569539507 as dead server
> >     at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:341)
> >     at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:254)
> >     at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:1342)
> >     at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:5087)
> >     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
> >     at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1879)
> >
> > org.apache.hadoop.hbase.YouAreDeadException: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing b08.jsepc.com,60020,1400569539507 as dead server
> >
> >
> > *In namenode's log:*
> > 2014-05-27 16:18:04,593 INFO BlockStateChange (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock: blockMap updated: 10.134.101.114:50010 is added to blk_1078908684_5176954{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.134.101.107:50010|RBW], ReplicaUnderConstruction[10.134.101.114:50010|RBW], ReplicaUnderConstruction[10.134.101.119:50010|RBW]]} size 0
> > 2014-05-27 16:18:04,593 INFO BlockStateChange (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock: blockMap updated: 10.134.101.107:50010 is added to blk_1078908684_5176954{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.134.101.107:50010|RBW], ReplicaUnderConstruction[10.134.101.114:50010|RBW], ReplicaUnderConstruction[10.134.101.119:50010|RBW]]} size 0
> > 2014-05-27 16:18:04,596 INFO hdfs.StateChange (FSNamesystem.java:completeFile(2814)) - DIR* completeFile: /apps/hbase/data/data/default/Test-QSH-ARCHIVES/4baa021e822ca4843d64af2e3641deab/.tmp/2d251bb78829442ea22a8031e58721a0 is closed by DFSClient_hb_rs_a07.jsepc.com,60020,1400569539807_-1204612826_29
> > 2014-05-27 16:18:04,876 INFO BlockStateChange (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock: blockMap updated: 10.134.101.108:50010 is added to blk_1078908671_5176941{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.134.101.120:50010|RBW], ReplicaUnderConstruction[10.134.101.105:50010|RBW], ReplicaUnderConstruction[10.134.101.108:50010|RBW]]} size 0
> > 2014-05-27 16:18:04,877 INFO BlockStateChange (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock: blockMap updated: 10.134.101.105:50010 is added to blk_1078908671_5176941{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.134.101.120:50010|RBW], ReplicaUnderConstruction[10.134.101.105:50010|RBW], ReplicaUnderConstruction[10.134.101.108:50010|RBW]]} size 0
> > 2014-05-27 16:18:04,878 INFO BlockStateChange (BlockManager.java:logAddStoredBlock(2237)) - BLOCK* addStoredBlock: blockMap updated: 10.134.101.120:50010 is added to blk_1078908671_5176941{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.134.101.120:50010|RBW], ReplicaUnderConstruction[10.134.101.105:50010|RBW], ReplicaUnderConstruction[10.134.101.108:50010|RBW]]} size 0
> > 2014-05-27 16:18:04,880 INFO hdfs.StateChange (FSNamesystem.java:completeFile(2814)) - DIR* completeFile: /apps/hbase/data/data/default/Test-QSH-ARCHIVES/872cb2b3a7c3b0ef9c3982abea565329/.tmp/e2b00b2387124e79a87858afc21024fc is closed by DFSClient_hb_rs_b10.jsepc.com,60020,1400569547596_-57117285_29
> > 2014-05-27 16:18:06,193 ERROR security.UserGroupInformation (UserGroupInformation.java:doAs(1494)) - PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.io.IOException: BP-898918553-10.134.101.112-1393904898674:*blk_1078908599_5176867* does not exist or is not under Constructionnull
> > 2014-05-27 16:18:06,193 INFO ipc.Server (Server.java:run(2075)) - IPC Server handler 568 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.updateBlockForPipeline from 10.134.101.118:47094 Call#309388 Retry#0: error: java.io.IOException: BP-898918553-10.134.101.112-1393904898674:*blk_1078908599_5176867* does not exist or is not under Constructionnull
> > java.io.IOException: BP-898918553-10.134.101.112-1393904898674:*blk_1078908599_5176867* does not exist or is not under Constructionnull
> >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkUCBlock(FSNamesystem.java:5526)
> >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updateBlockForPipeline(FSNamesystem.java:5591)
> >     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updateBlockForPipeline(NameNodeRpcServer.java:628)
> >     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updateBlockForPipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:803)
> >     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59644)
> >     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> >     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
> >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:396)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> >     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
> > 2014-05-27 16:18:06,202 INFO hdfs.StateChange (FSNamesystem.java:saveAllocatedBlock(2873)) - BLOCK* allocateBlock: /apps/hbase/data/WALs/b05.jsepc.com,60020,1400569521345/b05.jsepc.com%2C60020%2C1400569521345.1401178672149. BP-898918553-10.134.101.112-1393904898674 blk_1078908686_5176956{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[10.134.101.115:50010|RBW], ReplicaUnderConstruction[10.134.101.109:50010|RBW], ReplicaUnderConstruction[10.134.101.107:50010|RBW]]}
> > 2014-05-27 16:18:06,208 ERROR security.UserGroupInformation (UserGroupInformation.java:doAs(1494)) - PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.io.IOException: BP-898918553-10.134.101.112-1393904898674:blk_1078908600_5176868 does not exist or is not under Constructionnull
> > 2014-05-27 16:18:06,208 INFO ipc.Server (Server.java:run(2075)) - IPC Server handler 586 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.updateBlockForPipeline from 10.134.101.118:47094 Call#309390 Retry#0: error: java.io.IOException: BP-898918553-10.134.101.112-1393904898674:blk_1078908600_5176868 does not exist or is not under Constructionnull
> > java.io.IOException: BP-898918553-10.134.101.112-1393904898674:blk_1078908600_5176868 does not exist or is not under Constructionnull
> >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkUCBlock(FSNamesystem.java:5526)
> >     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updateBlockForPipeline(FSNamesystem.java:5591)
> >     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updateBlockForPipeline(NameNodeRpcServer.java:628)
> >     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updateBlockForPipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:803)
> >     at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59644)
> >     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> >     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)
> >     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:396)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> >     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)
> > 2014-05-27 16:18:06,209 ERROR security.UserGroupInformation (UserGroupInformation.java:doAs(1494)) - PriviledgedActionException as:hadoop (auth:SIMPLE) cause:org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /apps/hbase/data/WALs/b08.jsepc.com,60020,1400569539507/b08.jsepc.com%2C60020%2C1400569539507.1401178568958: File does not exist. Holder DFSClient_hb_rs_b08.jsepc.com,60020,1400569539507_-1768346484_29 does not have any open files.
> > 2014-05-27 16:18:06,210 INFO ipc.Server (Server.java:run(2073)) - IPC Server handler 415 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getAdditionalDatanode from 10.134.101.118:47094 Call#309391 Retry#0: error: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /apps/hbase/data/WALs/b08.jsepc.com,60020,1400569539507/b08.jsepc.com%2C60020%2C1400569539507.1401178568958: File does not exist. Holder DFSClient_hb_rs_b08.jsepc.com,60020,1400569539507_-1768346484_29 does not have any open files.
> > 2014-05-27 16:18:06,210 ERROR security.UserGroupInformation (UserGroupInformation.java:doAs(1494)) - PriviledgedActionException as:hadoop (auth:SIMPLE) cause:org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /apps/hbase/data/WALs/b08.jsepc.com,60020,1400569539507/b08.jsepc.com%2C60020%2C1400569539507.1401178572707: File does not exist. Holder DFSClient_hb_rs_b08.jsepc.com,60020,1400569539507_-1768346484_29 does not have any open files.
> > 2014-05-27 16:18:06,210 INFO ipc.Server (Server.java:run(2073)) - IPC Server handler 487 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.getAdditionalDatanode from 10.134.101.118:47094 Call#309389 Retry#0: error: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /apps/hbase/data/WALs/b08.jsepc.com,60020,1400569539507/b08.jsepc.com%2C60020%2C1400569539507.1401178572707: File does not exist. Holder DFSClient_hb_rs_b08.jsepc.com,60020,1400569539507_-1768346484_29 does not have any open files.
> > 2014-05-27 16:18:06,210 INFO hdfs.StateChange (FSNamesystem.java:fsync(3471)) - BLOCK* fsync: /apps/hbase/data/WALs/b05.jsepc.com,60020,1400569521345/b05.jsepc.com%2C60020%2C1400569521345.1401178672149 for DFSClient_hb_rs_b05.jsepc.com,60020,1400569521345_32850186_29
> >
> >
> > Was it the fact that one HDFS block did not exist that caused the RS to crash? And why would a block go missing?
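
A few notes from my own digging, in case they help anyone hitting the same thing.

First, the crash itself: the FATAL line shows the master had already marked this RS dead before the RS checked in (YouAreDeadException: "Server REPORT rejected; currently processing ... as dead server"). That usually means the RS was unresponsive for longer than its ZooKeeper session timeout, e.g. during a long GC pause. A minimal sketch to print the timeout the cluster is actually running with (assuming stock HBase client jars and hbase-site.xml on the classpath; the class name is just for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HConstants;

public class ZkTimeoutCheck {
    public static void main(String[] args) {
        // Loads hbase-site.xml from the classpath.
        Configuration conf = HBaseConfiguration.create();
        // A pause on the RS longer than this lets its ZooKeeper session
        // expire; the master marks it dead, and the RS's next report is
        // rejected with YouAreDeadException, as in the log above.
        int timeout = conf.getInt(HConstants.ZK_SESSION_TIMEOUT,
                HConstants.DEFAULT_ZK_SESSION_TIMEOUT);
        System.out.println("zookeeper.session.timeout = " + timeout + " ms");
    }
}

If GC pauses in the RS log approach that number, the usual remedies are tuning the heap/GC or raising the timeout.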
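Second, on why blocks can look missing: my understanding is that once the master declares the RS dead, it recovers the lease on that RS's WAL files so it can split them. From that point the old writer's in-flight blocks are no longer "under construction" for it, which would explain the "does not exist or is not under Construction" and "No lease on ... File does not exist" lines above without any actual data loss. A small sketch to see what HDFS itself thinks of one of those files (assuming the Hadoop client config is on the classpath; the path is the WAL from the log and may already have been split away by the time this runs):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WalBlockCheck {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hdfs-site.xml from the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path wal = new Path("/apps/hbase/data/WALs/b08.jsepc.com,60020,1400569539507/"
                + "b08.jsepc.com%2C60020%2C1400569539507.1401178568958");
        // Throws FileNotFoundException if the WAL has already been split away.
        FileStatus st = fs.getFileStatus(wal);
        for (BlockLocation loc : fs.getFileBlockLocations(st, 0, st.getLen())) {
            // Prints offset, length, and the datanodes holding each block.
            System.out.println(loc);
        }
    }
}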
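Finally, the "Error Recovery for block ... bad datanode" lines look like the DFSClient's ordinary write-pipeline recovery, which is governed by client-side settings. A sketch that reads back the relevant Hadoop 2.x keys (these keys exist in Hadoop 2.x; what values are appropriate for this cluster is a separate question):

import org.apache.hadoop.conf.Configuration;

public class PipelinePolicyCheck {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Whether a failed datanode in the write pipeline is replaced
        // rather than continuing with fewer replicas.
        System.out.println("replace-datanode-on-failure.enable = "
                + conf.get("dfs.client.block.write.replace-datanode-on-failure.enable"));
        // DEFAULT replaces only for larger pipelines / long-lived writers
        // (e.g. WALs); the alternatives are NEVER and ALWAYS.
        System.out.println("replace-datanode-on-failure.policy = "
                + conf.get("dfs.client.block.write.replace-datanode-on-failure.policy"));
    }
}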