hadoop-hdfs-issues mailing list archives

From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4523) Fix concat for snapshots
Date Mon, 25 Feb 2013 20:34:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586259#comment-13586259 ]

Tsz Wo (Nicholas), SZE commented on HDFS-4523:
----------------------------------------------

> ..., that that snapshot should be modified to include the final concat'ed file ...

That's not correct.  The final concat'ed file won't be in the previously created snapshots.
Only the transient files are removed.

Concat is a very special operation in HDFS and it has a lot of restrictions: the original
file sizes must be multiples of the block size, the files must have the same replication
factor, and they must be in the same directory.  We use those characteristics to identify
those files as transient files for concat.
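
The restrictions listed above can be sketched as a simple check.  This is only an
illustration of the three conditions named in this comment, not the NameNode code from the
patch; FileMeta and ConcatRestrictions are hypothetical names invented for this sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical metadata holder for illustration only; not an HDFS class.
final class FileMeta {
    final String parentDir;
    final long length;
    final long blockSize;
    final short replication;

    FileMeta(String parentDir, long length, long blockSize, short replication) {
        this.parentDir = parentDir;
        this.length = length;
        this.blockSize = blockSize;
        this.replication = replication;
    }
}

final class ConcatRestrictions {
    // Returns true only if every file meets the three restrictions from the comment:
    // length is a multiple of the block size, same replication, same directory.
    static boolean satisfiesRestrictions(List<FileMeta> files) {
        if (files.isEmpty()) {
            return false;
        }
        FileMeta first = files.get(0);
        for (FileMeta f : files) {
            if (f.blockSize <= 0 || f.length % f.blockSize != 0) {
                return false; // size is not a whole number of blocks
            }
            if (f.replication != first.replication) {
                return false; // replication factors differ
            }
            if (!f.parentDir.equals(first.parentDir)) {
                return false; // files live in different directories
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long blk = 128L * 1024 * 1024; // assumed block size for the demo
        List<FileMeta> parts = Arrays.asList(
            new FileMeta("/tmp/copy", 2 * blk, blk, (short) 3),
            new FileMeta("/tmp/copy", blk, blk, (short) 3));
        System.out.println(satisfiesRestrictions(parts));
    }
}
```

A file whose length is not block-aligned, or that sits in a different directory, fails the
check and so would not be treated as a concat transient file under this model.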
                
> Fix concat for snapshots
> ------------------------
>
>                 Key: HDFS-4523
>                 URL: https://issues.apache.org/jira/browse/HDFS-4523
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>         Attachments: h4523_20130222.patch, h4523_20130223.patch, h4523_20130225.patch
>
>
> The use case of concat is for copying large files across clusters using the following steps.
> - Step 1: The blocks of a file in the source cluster are copied in parallel to transient
>   files in the destination cluster.
> - Step 2: Then the transient files in the destination cluster are concatenated in order
>   to obtain the original file.
> If a snapshot is taken in the destination cluster before Step 2, some transient files
> may be captured in the snapshot.  These transient files should be removed in Step 2.
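
The behavior described in the issue and in the comment above can be pictured with a toy
in-memory model.  It is a sketch under assumptions, not HDFS code: SnapshotConcatModel and
its methods are hypothetical, file names stand in for inodes, and a snapshot is modeled as
a plain copy of the set of names present when it was taken.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model for illustration only; all names here are hypothetical.
final class SnapshotConcatModel {
    final Set<String> current = new TreeSet<>();
    final Map<String, Set<String>> snapshots = new HashMap<>();

    void create(String name) {
        current.add(name);
    }

    // A snapshot records the file names present at the time it is taken.
    void snapshot(String snapshotName) {
        snapshots.put(snapshotName, new TreeSet<>(current));
    }

    // Step 2: the transient files are replaced by the target in the current state
    // and are also removed from earlier snapshots.  The target is NOT added to
    // those snapshots.
    void concat(String target, List<String> transientFiles) {
        current.removeAll(transientFiles);
        current.add(target);
        for (Set<String> snap : snapshots.values()) {
            snap.removeAll(transientFiles);
        }
    }

    public static void main(String[] args) {
        SnapshotConcatModel dir = new SnapshotConcatModel();
        List<String> parts = Arrays.asList("part-0", "part-1", "part-2");
        for (String p : parts) {
            dir.create(p);              // Step 1: transient files are copied in
        }
        dir.snapshot("s1");             // snapshot taken before Step 2
        dir.concat("bigfile", parts);   // Step 2
        System.out.println("current = " + dir.current);
        System.out.println("s1 = " + dir.snapshots.get("s1"));
    }
}
```

In this model the snapshot taken before Step 2 ends up containing neither the transient
files nor the final concat'ed file, which matches the statement in the comment.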

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
