hadoop-common-dev mailing list archives

From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-3294) distcp leaves empty blocks after successful execution
Date Mon, 28 Apr 2008 23:55:55 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HADOOP-3294:
-------------------------------------------

    Attachment: 3294_20080423b_0.16.patch

3294_20080423b_0.16.patch: for 0.16

> distcp leaves empty blocks after successful execution
> -----------------------------------------------------
>
>                 Key: HADOOP-3294
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3294
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.16.3
>         Environment: 0.16.3 without any patches. Dfs permissions turned off everywhere, such that HADOOP-3138 and HADOOP-3186 do not apply
>            Reporter: Christian Kunz
>            Assignee: Tsz Wo (Nicholas), SZE
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: 3294_20080423.patch, 3294_20080423b.patch, 3294_20080423b_0.16.patch
>
>
> I copied around 40 TB between two Hadoop clusters, with distcp running on the source cluster.
> The job was *successful*, but one destination file was empty because its only block was empty.
> None of the distcp log files mention this file.
> There were a couple of messages in the namenode server log of the destination cluster referencing the file:
> hadoop-xxxnamenode-yyy.log.2008-04-19:2008-04-19 02:19:15,666 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: destinationDir/_distcp_tmp_z0g93p/fileName. blk_-9209890281741927376
> hadoop-xxx-namenode-yyy.log.2008-04-19:2008-04-19 02:54:45,820 WARN org.apache.hadoop.dfs.StateChange: DIR* NameSystem.internalReleaseCreate: attempt to release a create lock on destinationDir/_distcp_tmp_z0g93p/fileName file does not exist.
> distcp should not rely on the user to double-check.
> Would it make sense to add a reducer to compare destination file sizes with source file sizes and take some appropriate action?
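
As a rough illustration of the size check suggested above, the following Python sketch compares per-file sizes from two listings and reports destination files that are missing or differ in length. It is not distcp code and not the fix attached to this issue; the function name `find_size_mismatches` and the dict-based listings (relative path mapped to size in bytes) are assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple


# Hypothetical helper, not part of distcp: each listing maps a path relative
# to the copy root to the file length in bytes.
def find_size_mismatches(
    source_sizes: Dict[str, int],
    dest_sizes: Dict[str, int],
) -> List[Tuple[str, int, Optional[int]]]:
    """Return (path, source_size, dest_size) for every source file whose
    destination copy is missing (dest_size is None) or has a different length."""
    mismatches = []
    for path in sorted(source_sizes):
        src_len = source_sizes[path]
        dst_len = dest_sizes.get(path)  # None when the file is absent
        if dst_len != src_len:
            mismatches.append((path, src_len, dst_len))
    return mismatches


if __name__ == "__main__":
    # Made-up listings: one file arrived empty, as in the report above.
    source = {"dir/a": 1024, "dir/b": 2048}
    dest = {"dir/a": 1024, "dir/b": 0}
    for path, src_len, dst_len in find_size_mismatches(source, dest):
        print(f"size mismatch: {path} source={src_len} destination={dst_len}")
```

A length comparison of this kind would flag an empty or truncated copy like the one described here, but it cannot detect corruption that leaves the file length unchanged; that would need a checksum comparison.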

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

