hadoop-hdfs-dev mailing list archives

From "Jinsong Hu" <jinsong...@hotmail.com>
Subject Re: Fw: hadoop data loss issue discovered. please fix!
Date Fri, 09 Jul 2010 16:37:20 GMT
Hi, Todd:
  I tried it from scratch, and it is still not good. Currently I am using yahoo 
hadoop 0.20.9 and hbase 0.20.5, and I have had too much trouble with them. I am 
thinking of moving to a supported and blessed combination of hadoop and hbase. 
What is the most stable combination of the two?
  When I send enough data to hbase, the region server times out while reading 
and writing data, and finally the regionserver shuts itself down. I have spent 
too much time on this, and I am thinking of switching to a distribution.

Jinsong


--------------------------------------------------
From: "Todd Lipcon" <todd@cloudera.com>
Sent: Friday, May 21, 2010 11:14 PM
To: <hdfs-dev@hadoop.apache.org>
Subject: Re: Fw: hadoop data loss issue discovered. please fix!

> Hi Jinsong,
>
> I don't see any data loss here.
>
> The sequence of events from the logs:
>
> ==> NN allocates block:
> 2010-05-18 21:21:29,731 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.allocateBlock: 
> /hbase/.META./1028785192/info/656097411976846533.
> blk_5636039758999247483_31304886
>
> ===> First DN reports it has received block
> 2010-05-18 21:21:29,913 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addStoredBlock: blockMap updated: 10.110.8.63:50010 is added to
> blk_5636039758999247483_31304886 size 441
>
> ===> Client calls completeFile
> 2010-05-18 21:21:29,913 INFO org.apache.hadoop.hdfs.StateChange: DIR*
> NameSystem.completeFile: file
> /hbase/.META./1028785192/info/656097411976846533 is closed by
> DFSClient_-919320526
>
> ===> 2nd and 3rd DN have not yet heartbeated since receiving the block, so
> replication count is low, and unnecessary replication is scheduled. This is
> a known issue - I was actually meaning to file a JIRA about it this week.
> 2010-05-18 21:21:31,987 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask
> 10.110.8.63:50010 to replicate blk_5636039758999247483_31304886 to
> datanode(s) 10.110.8.86:50010 10.110.8.69:50010
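>
> To make the race concrete, here is a toy sketch (plain Java, not the actual
> FSNamesystem code) of why the NN can consider the block under-replicated right
> after completeFile: it only counts the replicas it has already heard about.
>
> import java.util.*;
>
> public class ReplicationRaceSketch {
>     public static void main(String[] args) {
>         int targetReplication = 3;
>         // All three pipeline DNs already hold the block on disk...
>         List<String> holders = Arrays.asList(
>                 "10.110.8.63:50010", "10.110.8.69:50010", "10.110.8.59:50010");
>         // ...but only the first one has reported it (blockReceived) so far.
>         Set<String> reportedToNN =
>                 new HashSet<>(Collections.singletonList("10.110.8.63:50010"));
>
>         long liveReplicas = holders.stream().filter(reportedToNN::contains).count();
>         if (liveReplicas < targetReplication) {
>             // Here the NN logs "ask ... to replicate ..." even though the extra
>             // copies are unnecessary; the surplus is cleaned up later once
>             // every DN has reported.
>             System.out.println("under-replicated: " + liveReplicas + "/" + targetReplication);
>         }
>     }
> }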
>
> ===> Other DNs check in (within 4-5 seconds of file completion, which is
> reasonable heartbeat time)
> 2010-05-18 21:21:33,156 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addStoredBlock: blockMap updated: 10.110.8.69:50010 is added to
> blk_5636039758999247483_31304886 size 441
> 2010-05-18 21:21:34,413 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addStoredBlock: blockMap updated: 10.110.8.59:50010 is added to
> blk_5636039758999247483_31304886 size 441
>
> ===> 8 seconds later the first replication goes through and cleanup of
> excess replicas happens
> 2010-05-18 21:21:41,941 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addStoredBlock: blockMap updated: 10.110.8.86:50010 is added to
> blk_5636039758999247483_31304886 size 441
> 2010-05-18 21:21:41,941 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.chooseExcessReplicates: (10.110.8.63:50010,
> blk_5636039758999247483_31304886) is added to recentInvalidateSets
> 2010-05-18 21:21:43,995 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask
> 10.110.8.63:50010 to delete  blk_5636039758999247483_31304886
> blk_4349310048904429157_31304519
>
> ===> another 14 seconds later, the other replication goes through and
> another excess is invalidated
> 2010-05-18 21:21:57,835 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addStoredBlock: blockMap updated: 10.110.8.85:50010 is added to
> blk_5636039758999247483_31304886 size 441
> 2010-05-18 21:21:57,835 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.chooseExcessReplicates: (10.110.8.69:50010,
> blk_5636039758999247483_31304886) is added to recentInvalidateSets
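>
> Once all of the replicas have reported, the surplus is simply trimmed back to
> the target count; the block still has three live copies. Roughly (again, just
> an illustration, not the real chooseExcessReplicates logic):
>
> import java.util.*;
>
> public class ExcessReplicaSketch {
>     public static void main(String[] args) {
>         int targetReplication = 3;
>         // Replicas the NN knows about once the blockReceived reports and the
>         // scheduled re-replications have all arrived.
>         List<String> replicas = new ArrayList<>(Arrays.asList(
>                 "10.110.8.63:50010", "10.110.8.69:50010", "10.110.8.59:50010",
>                 "10.110.8.86:50010", "10.110.8.85:50010"));
>
>         List<String> toInvalidate = new ArrayList<>();
>         while (replicas.size() > targetReplication) {
>             // The real NN picks victims by node state (e.g. free space), not list order.
>             toInvalidate.add(replicas.remove(0));
>         }
>         System.out.println("invalidate on: " + toInvalidate); // e.g. .63 and .69
>         System.out.println("keep on:       " + replicas);     // 3 live replicas remain, no data loss
>     }
> }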
>
> ===> about 5 minutes later, the regionserver performs a compaction and asks
> the NN to delete this file
> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet
> of 10.110.8.63:50010
> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet
> of 10.110.8.69:50010
> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet
> of 10.110.8.59:50010
> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet
> of 10.110.8.86:50010
> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet
> of 10.110.8.85:50010
> 2010-05-18 21:26:39,389 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=hadoop,hadoop
> ip=/10.110.8.85 cmd=delete
> src=/hbase/.META./1028785192/info/656097411976846533    dst=null    perm=null
>
> As for the errors seen in the regionserver, the issue is that it called
> open() before replication was done, and therefore only got one block
> location. When the replica was removed there, the RS should have gone back
> to the NN for more replicas. I'm not sure if the codebase you're running
> includes HDFS-445. Judging from the CHANGES.txt, it appears not. So, it's
> likely that you're hitting this bug where DFSClient wouldn't refetch block
> locations from the NN for positional reads.
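>
> The fix is essentially to go back to the NN for fresh block locations when a
> cached replica fails. As a rough sketch of that retry idea (illustrative only,
> with hypothetical interfaces, not the real DFSClient code):
>
> import java.io.IOException;
> import java.util.List;
>
> public class PreadRetrySketch {
>     // Hypothetical stand-ins for the NN and DN RPCs.
>     interface NameNodeClient {
>         List<String> getBlockLocations(String path, long offset) throws IOException;
>     }
>     interface DataNodeClient {
>         byte[] readBlock(String dnAddr, String path, long offset, int len) throws IOException;
>     }
>
>     static byte[] pread(NameNodeClient nn, DataNodeClient dn, String path,
>                         long offset, int len, int maxRetries) throws IOException {
>         IOException last = null;
>         for (int attempt = 0; attempt <= maxRetries; attempt++) {
>             // Re-fetch locations on every attempt so replicas added (or removed)
>             // after open() are seen, instead of reusing the stale cached list.
>             for (String addr : nn.getBlockLocations(path, offset)) {
>                 try {
>                     return dn.readBlock(addr, path, offset, len);
>                 } catch (IOException e) {
>                     last = e; // e.g. "Got error in response to OP_READ_BLOCK"; try the next replica
>                 }
>             }
>         }
>         throw new IOException("read failed for " + path + " at offset " + offset, last);
>     }
> }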
>
> Thanks
> -Todd
>
>
> On Fri, May 21, 2010 at 3:32 PM, Jinsong Hu <jinsong_hu@hotmail.com> 
> wrote:
>
>> Hi, Todd:
>>  I have cut all the logs for those 10 minutes from the 3 machines; if you
>> grep for 5636039758999247483, you will see all the log entries from the
>> regionserver, namenode, and datanodes, and you can compare the time sequence.
>> All of our machines' clocks are synced via ntp, so you can line the logs up
>> side by side and reconstruct the exact sequence of events.
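>>
>> If it helps, a quick way to pull that timeline together (just a throwaway
>> helper I would use, not part of hadoop) is to grep every log for the block ID
>> and sort by timestamp, since the clocks are NTP-synced:
>>
>> import java.io.IOException;
>> import java.nio.file.*;
>> import java.util.*;
>> import java.util.stream.Stream;
>>
>> public class BlockTimeline {
>>     public static void main(String[] args) throws IOException {
>>         String blockId = "5636039758999247483";
>>         List<String> matches = new ArrayList<>();
>>         for (String logFile : args) { // pass the NN, DN and RS log files as arguments
>>             try (Stream<String> lines = Files.lines(Paths.get(logFile))) {
>>                 lines.filter(l -> l.contains(blockId))
>>                      .forEach(l -> matches.add(l + "   [" + logFile + "]"));
>>             }
>>         }
>>         // Lines start with "yyyy-MM-dd HH:mm:ss,SSS", so lexicographic sort == time order.
>>         Collections.sort(matches);
>>         matches.forEach(System.out::println);
>>     }
>> }
>>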
>>  The build I took is from yahoo 0.20.9 distribution.
>>
>> Jinsong
>>
>> --------------------------------------------------
>> From: "Todd Lipcon" <todd@cloudera.com>
>> Sent: Friday, May 21, 2010 2:20 PM
>> To: <hdfs-dev@hadoop.apache.org>
>> Subject: Re: Fw: hadoop data loss issue discovered. please fix!
>>
>>
>>  Hi Jinsong,
>>>
>>> Could you upload a tarball of the log files somewhere from each of the 
>>> DNs
>>> and the RS involved? It's hard to trace through the log in the email 
>>> (the
>>> email added all kinds of wrapping, etc)
>>>
>>> -Todd
>>>
>>> On Fri, May 21, 2010 at 2:17 PM, Jinsong Hu <jinsong_hu@hotmail.com>
>>> wrote:
>>>
>>>  Hi, There:
>>>>  While using the hadoop 0.20.9-yahoo distribution and hbase 0.20.4, I
>>>> found that hadoop loses blocks under certain situations, and thus corrupts
>>>> hbase tables.
>>>>
>>>>
>>>>
>>>> I compared the namenode, datanode and hbase regionserver logs and figured
>>>> out the reason.
>>>>
>>>>
>>>>
>>>> The regionserver 10.110.8.85 asks the namenode 10.110.8.83 to save a block;
>>>> 10.110.8.84 gives multiple IPs, and the regionserver chooses 10.110.8.63 and
>>>> saves the block there. After a while, the namenode asks for the block to be
>>>> replicated to the 10.110.8.86 and 10.110.8.69 machines. A moment later, .86
>>>> and .69 receive the replica, but strangely 10.110.8.59 and 10.110.8.85 also
>>>> receive replicas of the block, even though they are not in the replication
>>>> list.
>>>>
>>>>
>>>>
>>>> Then chooseExcessReplicates asks to delete the excess replicas from .63 and
>>>> .69, thinking there are too many replicas. Even though .63 held the original
>>>> copy, the algorithm chooses which replica to delete based on the amount of
>>>> free disk space. A moment later, addToInvalidates (not from
>>>> chooseExcessReplicates) asks for the block to be deleted on .86, .85, and
>>>> .59. From my check of the code, this can only be possible if the block is
>>>> corrupted.
>>>>
>>>>
>>>>
>>>> In the end, this block doesn't exist anywhere in the cluster, and it is
>>>> permanently lost.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> namenode:
>>>>
>>>> 2010-05-18 21:21:41,941 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.chooseExcessReplicates: (10.110.8.63:50010, blk_5636039758999247483_31304886) is added to recentInvalidateSets
>>>> 2010-05-18 21:21:43,995 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.110.8.63:50010 to delete  blk_5636039758999247483_31304886 blk_4349310048904429157_31304519
>>>>
>>>> The block was initially added to 10.110.8.63, then replicated so that it was
>>>> on 10.110.8.63, .59, .69, .86 and .85. Subsequently, addToInvalidates removed
>>>> all of them. The code review shows that the replica is treated as corrupt and
>>>> all copies get deleted.
>>>>
>>>> 2010-05-18 21:21:29,913 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.110.8.63:50010 is added to blk_5636039758999247483_31304886 size 441
>>>> 2010-05-18 21:21:31,987 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.110.8.63:50010 to replicate blk_5636039758999247483_31304886 to datanode(s) 10.110.8.86:50010 10.110.8.69:50010
>>>> 2010-05-18 21:21:41,941 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.chooseExcessReplicates: (10.110.8.63:50010, blk_5636039758999247483_31304886) is added to recentInvalidateSets
>>>> 2010-05-18 21:21:43,995 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.110.8.63:50010 to delete  blk_5636039758999247483_31304886 blk_4349310048904429157_31304519
>>>> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet of 10.110.8.63:50010
>>>> 2010-05-18 21:26:45,953 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.110.8.63:50010 to delete  blk_-1838286221287242082_31305179 blk_8446762564182513417_31305143 blk_5636039758999247483_31304886 blk_4628640249731313760_31305046 blk_7460947863067370701_31270225 blk_-4468681536500281247_31270225 blk_8453517711101429609_31303917 blk_9126133835045521966_31303972 blk_4623110280826973929_31305203 blk_-2581238696314957800_31305033 blk_7461125351290749755_31305052
>>>>
>>>> 2010-05-18 21:21:31,987 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.110.8.63:50010 to replicate blk_5636039758999247483_31304886 to datanode(s) 10.110.8.86:50010 10.110.8.69:50010
>>>> 2010-05-18 21:21:33,156 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.110.8.69:50010 is added to blk_5636039758999247483_31304886 size 441
>>>> 2010-05-18 21:21:57,835 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.chooseExcessReplicates: (10.110.8.69:50010, blk_5636039758999247483_31304886) is added to recentInvalidateSets
>>>> 2010-05-18 21:21:59,005 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.110.8.69:50010 to delete  blk_5636039758999247483_31304886
>>>> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet of 10.110.8.69:50010
>>>> 2010-05-18 21:26:42,951 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.110.8.69:50010 to delete  blk_-2124965527858346013_31270213 blk_-5027506345849158498_31270213 blk_5636039758999247483_31304886 blk_9148821113904458973_31305189 blk_4850797749721229572_31305072 blk_252039065084461924_31305031 blk_-8351836728009062091_31305208 blk_-7576696059515014894_31305194 blk_-2900250119736465962_31270214 blk_471700613578524871_31304950 blk_-190744003190006044_31305064 blk_7265057386742001625_31305073
>>>>
>>>> 2010-05-18 21:21:31,987 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.110.8.63:50010 to replicate blk_5636039758999247483_31304886 to datanode(s) 10.110.8.86:50010 10.110.8.69:50010
>>>> 2010-05-18 21:21:41,941 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.110.8.86:50010 is added to blk_5636039758999247483_31304886 size 441
>>>> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet of 10.110.8.86:50010
>>>> 2010-05-18 21:26:42,951 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.110.8.86:50010 to delete  blk_-6242136662924452584_31259201 blk_5636039758999247483_31304886 blk_4850797749721229572_31305072 blk_252039065084461924_31305031 blk_-1317144678443645904_31305204 blk_6050185755706975664_31270230 blk_2671416971885801868_31304948 blk_-5582352089328547938_31305022 blk_-3115115738671914626_31270210
>>>>
>>>> 2010-05-18 21:21:34,413 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.110.8.59:50010 is added to blk_5636039758999247483_31304886 size 441
>>>> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet of 10.110.8.59:50010
>>>> 2010-05-18 21:26:39,950 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.110.8.59:50010 to delete  blk_5636039758999247483_31304886 blk_-4528512156635399625_31305212 blk_1439789418382469336_31305158 blk_8860574934531794641_31270219 blk_-8358193301564392132_31305029
>>>>
>>>> 2010-05-18 21:21:57,835 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.110.8.85:50010 is added to blk_5636039758999247483_31304886 size 441
>>>> 2010-05-18 21:26:39,388 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_5636039758999247483 is added to invalidSet of 10.110.8.85:50010
>>>> 2010-05-18 21:26:39,950 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* ask 10.110.8.85:50010 to delete  blk_-6242136662924452584_31259201 blk_5636039758999247483_31304886 blk_4628640249731313760_31305046 blk_4747588241975451642_31305123 blk_-6876078628884993825_31270230 blk_-4468681536500281247_31270225 blk_7325830193509411302_31270230 blk_8453517711101429609_31303917 blk_-6094734447689285387_31305127 blk_3353439739797003235_31305037 blk_-5027506345849158498_31270213 blk_1484161645992497144_31270225 blk_4464987648045469454_31305144 blk_7460947863067370701_31270225 blk_-1170815606945644545_31270230 blk_6050185755706975664_31270230 blk_-8358193301564392132_31305029 blk_2671416971885801868_31304948 blk_5593547375459437465_31286511 blk_-2581238696314957800_31305033 blk_4732635559915402193_31270230 blk_-2124965527858346013_31270213 blk_-5837992573431863412_31286612 blk_-432558447034944954_31270208 blk_-3407615138527189735_31305069 blk_8860574934531794641_31270219 blk_233110856487529716_31270229 blk_312750273180273303_31270228 blk_7461125351290749755_31305052 blk_-8902661185532055148_31304947 blk_-8555258258738129670_31270210 blk_252039065084461924_31305031 blk_9037118763503479133_31305120 blk_-8494656323754369174_31305105 blk_9126133835045521966_31303972 blk_-5582352089328547938_31305022 blk_-2900250119736465962_31270214 blk_-3115115738671914626_31270210 blk_7612090442234634555_31270225 blk_5876492007747505188_31270213 blk_471700613578524871_31304950 blk_-190744003190006044_31305064
>>>>
>>>> datanode 10.110.8.63:
>>>>
>>>> 2010-05-18 21:21:46,058 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Deleting block blk_5636039758999247483_31304886 file /hadoop_data_dir/dfs/data/current/subdir23/blk_5636039758999247483
>>>>
>>>> hbase region server 10.110.8.85:
>>>>
>>>> In DFSClient.java:
>>>>
>>>> DatanodeInfo chosenNode = bestNode(nodes, deadNodes);
>>>> InetSocketAddress targetAddr = NetUtils.createSocketAddr(chosenNode.getName());
>>>> return new DNAddrPair(chosenNode, targetAddr);
>>>>
>>>> The client still picked 10.110.8.63, even though the namenode sent the
>>>> command to delete the block at 21:21:43,995 and the deletion was executed at
>>>> 21:21:46,058. Why?
>>>>
>>>> 2010-05-18 21:21:46,188 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.110.8.63:50010 for file /hbase/.META./1028785192/info/656097411976846533 for block 5636039758999247483:java.io.IOException: Got error in response to OP_READ_BLOCK for file /hbase/.META./1028785192/info/656097411976846533 for block 5636039758999247483
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>>
>>
>
>
> -- 
> Todd Lipcon
> Software Engineer, Cloudera
> 
