hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1491) After successful distcp, couple of checksum error files
Date Thu, 14 Jun 2007 20:55:26 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12504914 ]

Raghu Angadi commented on HADOOP-1491:
--------------------------------------

My impression from looking at one case from Koji's investigation:

Two files are involved: A and B. On the source side of distcp both are fine. On the destination
side, A (A_dest) is fine and B_dest is corrupted. .B_dest.crc is the same as .B_src.crc, but B_dest
has the same content as A_src. Both A and B are small and have only one block. It looks like, while
writing B_dest, it somehow wrote the block corresponding to A.
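
To check whether other corrupted files follow the same pattern (destination content that belongs to a different source file), something like the sketch below could be used. It is only an illustration, not the procedure used in the investigation: it assumes the file contents have already been pulled into memory as bytes, uses a whole-file CRC32 as a stand-in index rather than reading Hadoop's .crc files, and the function name find_swapped_content is hypothetical.

```python
import zlib
from collections import defaultdict
from typing import Dict, List


def crc32_of(data: bytes) -> int:
    """Whole-file CRC32 (a stand-in; this does not parse Hadoop's .crc files)."""
    return zlib.crc32(data) & 0xFFFFFFFF


def find_swapped_content(src: Dict[str, bytes],
                         dest: Dict[str, bytes]) -> Dict[str, List[str]]:
    """For every destination file that differs from its own source,
    list the other source files whose content it matches exactly.

    An empty list means the file is corrupted in some other way.
    """
    # Index source files by checksum so each lookup avoids a full scan.
    by_crc = defaultdict(list)
    for name, data in src.items():
        by_crc[crc32_of(data)].append(name)

    swapped = {}
    for name, data in dest.items():
        if name in src and src[name] == data:
            continue  # copied correctly
        candidates = by_crc.get(crc32_of(data), [])
        # Confirm with a byte comparison, since CRC32 values can collide.
        swapped[name] = [other for other in candidates
                         if other != name and src[other] == data]
    return swapped


if __name__ == "__main__":
    # Hypothetical data mirroring the case above: B_dest holds A's content.
    src = {"A": b"contents of A", "B": b"contents of B"}
    dest = {"A": b"contents of A", "B": b"contents of A"}
    print(find_swapped_content(src, dest))
```

With the sample data, B is reported as matching source file A, which is the pattern described above.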

One possible bug that can result in this situation is HADOOP-1396. If both A_dest and B_dest
were created around the same time, then it is an even more likely culprit (we can infer the file
creation times from the creation times of the blocks).


> After successful distcp, couple of checksum error files
> -------------------------------------------------------
>
>                 Key: HADOOP-1491
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1491
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.12.3
>            Reporter: Koji Noguchi
>
> Tried copying 700,000 files with distcp. 8 mappers per node. Single dfs.client.buffer.dir.
> Distcp ran on a 25-node mapreduce cluster.
> A couple of tasks failed, but the job was successful.
> When checked, 12 files were corrupted (checksum error).
> This is repeatable.
> I'll add more information as we find it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

