hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HADOOP-2049) distcp does not fail if source directory has files with missing blocks
Date Wed, 24 Oct 2007 18:18:51 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-2049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley resolved HADOOP-2049.
-----------------------------------

       Resolution: Duplicate
    Fix Version/s: 0.15.0
         Assignee: Chris Douglas

This was fixed as part of HADOOP-2048.

> distcp does not fail if source directory has files with missing blocks
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-2049
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2049
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.15.0
>         Environment: Nightly build: Oct 11, 2007.
>            Reporter: Murtaza A. Basrai
>            Assignee: Chris Douglas
>            Priority: Critical
>             Fix For: 0.15.0
>
>
> I copied a directory using distcp (to another directory on the same file system).
> There were 9 data blocks missing in the files in the source directory, which caused distcp to print messages like the following:
> ...
> 07/10/13 00:09:16 INFO mapred.JobClient:  map 1% reduce 0%
> 07/10/13 00:09:16 INFO mapred.JobClient: Task Id : task_200710120717_0081_m_000020_0, Status : FAILED
> java.io.IOException: Could not obtain block: blk_6787282547149034655 file=/srcdir/file1
>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1136)
>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:988)
>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1094)
>         at java.io.DataInputStream.read(DataInputStream.java:83)
>         at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.copy(CopyFiles.java:289)
>         at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:348)
>         at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:216)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1753)
> ...
> The corresponding tasks failed, but the retries were successful (all files with missing blocks in the source directory were copied as empty files in the target directory).
> I think that distcp should fail if it cannot successfully copy all the files (at least when no command-line options are given).
> This is critical for us as we intend to use distcp to copy databases from one dfs to another, and if silent failures can happen then we would have to monitor each distcp manually to ensure that it succeeded.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

