hadoop-common-dev mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Reopened: (HADOOP-2725) Distcp truncates some files when copying
Date Sat, 09 Feb 2008 03:08:08 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Douglas reopened HADOOP-2725:
-----------------------------------


I reverted this patch because its test case (TestCopyFiles) took nearly 400s on my machine,
up from 26s, due to a silently failing local-to-local test case. All 20 files copy successfully,
but the subsequent rename from the tmp file to the destination file fails:

{noformat}
2008-02-08 18:36:14,246 INFO  util.CopyFiles (CopyFiles.java:map(390)) - FAIL 2522487525519213817 : java.io.IOException: Fail to rename tmp file (=file:/path/build/test/data/destdat/_distcp_tmp_cq5yoa/2522487525519213817) to destination file (=file:/path/build/test/data/destdat/2522487525519213817)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.rename(CopyFiles.java:336)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.copy(CopyFiles.java:317)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:382)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:202)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:208)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:132)
Caused by: java.io.IOException: Target file:/path/build/test/data/destdat/.2522487525519213817.crc already exists
        at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:269)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:142)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:133)
        at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:211)
        at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:403)
        at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.rename(CopyFiles.java:333)
        ... 6 more
{noformat}

At a glance, this looks like a problem in LocalFileSystem, but I'm reverting this patch for
now.
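
The failure shape in the trace above can be sketched with plain java.nio, without any Hadoop classes. This is only an illustration under stated assumptions: the class StaleChecksumDemo and its helpers checksumPath and renameWithChecksum are hypothetical names, the ".<name>.crc" sidecar naming is taken from the trace, and the cause of the pre-existing target .crc file is not established here, so the sketch simply creates one by hand.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class StaleChecksumDemo {
    // Hypothetical helper: builds the ".<name>.crc" sibling path seen in the trace.
    static Path checksumPath(Path file) {
        return file.resolveSibling("." + file.getFileName() + ".crc");
    }

    // Hypothetical helper: moves the data file, then its checksum sidecar.
    // Neither move overwrites, so an existing target sidecar makes the second move throw.
    static void renameWithChecksum(Path src, Path dst) throws IOException {
        Files.move(src, dst);
        Path srcCrc = checksumPath(src);
        if (Files.exists(srcCrc)) {
            Files.move(srcCrc, checksumPath(dst));
        }
    }

    public static void main(String[] args) throws IOException {
        Path dest = Files.createTempDirectory("destdat");
        Path tmp = Files.createDirectory(dest.resolve("_distcp_tmp_demo"));

        // Data file plus sidecar in the tmp directory, as after a successful copy.
        Path src = Files.write(tmp.resolve("datafile"), "payload".getBytes());
        Files.write(checksumPath(src), "crc".getBytes());

        // Assumption for the demo: a sidecar is already present at the destination.
        Path dst = dest.resolve("datafile");
        Files.write(checksumPath(dst), "stale".getBytes());

        try {
            renameWithChecksum(src, dst);
            System.out.println("rename succeeded");
        } catch (FileAlreadyExistsException e) {
            System.out.println("rename failed, target already exists: " + e.getFile());
        }
    }
}
```

In this sketch the data file has already moved when the sidecar move fails, so the copy itself looks successful and only the rename step reports an error.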

> Distcp truncates some files when copying
> ----------------------------------------
>
>                 Key: HADOOP-2725
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2725
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs, util
>    Affects Versions: 0.16.0
>         Environment: Nightly build: http://hadoopqa.yst.corp.yahoo.com:8080/hudson/job/Hadoop-LinuxTest/770/
> With patches for HADOOP-2095 and HADOOP-2119.
>            Reporter: Murtaza A. Basrai
>            Assignee: Tsz Wo (Nicholas), SZE
>            Priority: Critical
>             Fix For: 0.16.1
>
>         Attachments: 2725_20080206.patch, 2725_20080208.patch
>
>
> We used distcp to copy ~100 TB of data across two clusters of ~1400 nodes each.
> Command used (it was run on the src cluster):
> hadoop distcp -log /logdir/logfile hdfs://src-namenode:8600//src-dir-1 hdfs://src-namenode:8600//src-dir-2 ... hdfs://src-namenode:8600//src-dir-n hdfs://tgt-namenode:8600//dst-dir
> Distcp completed without errors, but when we checked the file sizes on the src and tgt
> clusters, we noticed differences in file sizes for 9 files (~6 GB).
> src-file-1 666762714 bytes -> tgt-file-1 134217728 bytes
> src-file-2 673791814 bytes -> tgt-file-2 536870912 bytes
> src-file-3 692172075 bytes -> tgt-file-3 0 bytes
> All target files are truncated at block boundaries (some have 0 size).
> I looked at the log files, and noticed a few things:
> 1. There are 31059 log files (same as the number of Maps the job had).
> 2. 246 of the log files are non-empty.
> 3. All non-empty log files are of the form:
> SKIP: hdfs://src-namenode/src-dir-a/src-file-x
> SKIP: hdfs://src-namenode/src-dir-b/src-file-y
> SKIP: hdfs://src-namenode/src-dir-c/src-file-z
> 4. All 9 files which were truncated were included in the log files as skipped files.
> 5. All 9 files were the last entry in their respective log files.
> e.g.
> Non-empty logfile 1:
> SKIP: hdfs://src-namenode/src-dir-a/src-file-x
> SKIP: hdfs://src-namenode/src-dir-b/src-file-y
> SKIP: hdfs://src-namenode/src-dir-c/src-file-z  <-- Truncated file
> Non-empty logfile 2:
> SKIP: hdfs://src-namenode/src-dir-p/src-file-m
> SKIP: hdfs://src-namenode/src-dir-q/src-file-n  <-- Truncated file
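
The report's observation that the target files are truncated at block boundaries can be checked with simple arithmetic on the three size pairs it quotes. A minimal sketch follows, assuming a 128 MiB block size; that value is not stated in the report and is inferred only from 134217728 = 2^27, and the class BlockBoundaryCheck and method isBlockAligned are hypothetical names.

```java
public class BlockBoundaryCheck {
    // Assumption: 128 MiB blocks, inferred from the smallest non-zero target size.
    static final long BLOCK_SIZE = 128L * 1024 * 1024;

    // True when the size is a whole number of blocks (zero counts as aligned).
    static boolean isBlockAligned(long size) {
        return size % BLOCK_SIZE == 0;
    }

    public static void main(String[] args) {
        // {source bytes, target bytes} pairs quoted in the issue description.
        long[][] pairs = {
            {666762714L, 134217728L},
            {673791814L, 536870912L},
            {692172075L, 0L},
        };
        for (long[] p : pairs) {
            System.out.printf(
                "src=%d aligned=%b | tgt=%d aligned=%b (%d whole blocks)%n",
                p[0], isBlockAligned(p[0]),
                p[1], isBlockAligned(p[1]),
                p[1] / BLOCK_SIZE);
        }
    }
}
```

Under that assumption each target size is a whole number of blocks while none of the source sizes is, which matches the description of copies stopping at a block boundary rather than at an arbitrary offset.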

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

