hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10971) Distcp should not copy replication factor if source file is erasure coded
Date Thu, 06 Oct 2016 21:01:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15553164#comment-15553164 ]

Andrew Wang commented on HDFS-10971:
------------------------------------

If the EC policy is showing up as the replication factor in FileStatus, that's a pretty incompatible
change for users. Particularly since I don't see an "isErasureCoded" method on FileStatus
similar to isEncrypted or getAclBit, so there's no way for the user to reasonably interpret
this value without also querying the EC policy. This affects a lot more than just distcp.

We should consider squashing this and showing a dummy replication value like "1" instead for
compatibility, then adding an "isErasureCoded" method and perhaps a way of querying the policy.
This is sort of like getAclBit, where there's toShort and toExtendedShort, with the former
masking out the ACL bit.
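
To make the proposal concrete, here is a minimal sketch of the masking idea. It is not Hadoop code: the class SketchFileStatus, the bit layout (high bit of a 16-bit field marking an EC file, low byte holding a policy id), and the factory methods are all assumptions made up for illustration. Only the method names getReplication, isErasureCoded, and toExtendedShort echo the discussion above.

```java
// Hypothetical sketch; SketchFileStatus and its bit layout are NOT Hadoop APIs.
final class SketchFileStatus {
    // Assumption: the high bit of the raw 16-bit field flags an erasure-coded file.
    private static final int EC_BIT = 1 << 15;
    // Dummy value reported to callers that only understand replication.
    private static final short DUMMY_REPLICATION = 1;

    private final short raw;

    private SketchFileStatus(short raw) {
        this.raw = raw;
    }

    // A regular replicated file: the raw field is simply the replication factor.
    static SketchFileStatus replicated(short replication) {
        if (replication < 0) {
            throw new IllegalArgumentException("replication must be non-negative");
        }
        return new SketchFileStatus(replication);
    }

    // An erasure-coded file: the raw field carries the flag bit plus a policy id.
    static SketchFileStatus erasureCoded(byte policyId) {
        return new SketchFileStatus((short) (EC_BIT | (policyId & 0xFF)));
    }

    // Explicit query, analogous in spirit to getAclBit.
    boolean isErasureCoded() {
        return (raw & EC_BIT) != 0;
    }

    // Masked view: EC files report the dummy value instead of leaking the policy.
    short getReplication() {
        return isErasureCoded() ? DUMMY_REPLICATION : raw;
    }

    // Unmasked view, analogous in spirit to toExtendedShort.
    short toExtendedShort() {
        return raw;
    }
}
```

With this shape, existing callers of getReplication keep seeing a plausible value, while EC-aware callers check isErasureCoded before interpreting anything further.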

[~zhz] / [~jingzhao] / [~drankye], thoughts welcomed.

> Distcp should not copy replication factor if source file is erasure coded
> -------------------------------------------------------------------------
>
>                 Key: HDFS-10971
>                 URL: https://issues.apache.org/jira/browse/HDFS-10971
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: distcp
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-10971.testcase.patch
>
>
> The current erasure coding implementation uses the replication factor field to store the
> erasure coding policy.
> Distcp copies the source file's replication factor to the destination if {{-pr}} is specified.
> However, if the source file is EC, the replication factor (which is the EC policy) should not
> be replicated to the destination file. When an HdfsFileStatus is converted to FileStatus, the
> replication factor is set to 0 if it's an EC file.
> In fact, I will attach a test case that shows that trying to replicate the replication factor
> of an EC file results in an IOException: "Requested replication factor of 0 is less than the
> required minimum of 1 for /tmp/dst/dest2"
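
The behaviour requested in the description can be sketched as a small guard: when the source file is erasure coded, drop replication from the set of attributes to preserve. This is an illustration only, not the distcp implementation; the Attr enum, the PreserveSketch class, and the sourceIsErasureCoded flag are hypothetical names, and how the flag would be obtained is left open.

```java
import java.util.EnumSet;

// Hypothetical attribute set; not the enum distcp actually uses.
enum Attr { REPLICATION, BLOCK_SIZE, PERMISSION }

final class PreserveSketch {
    // Returns the attributes that should really be copied to the destination.
    // The caller's set is left unmodified.
    static EnumSet<Attr> effectiveAttrs(EnumSet<Attr> requested,
                                        boolean sourceIsErasureCoded) {
        EnumSet<Attr> result = EnumSet.copyOf(requested);
        if (sourceIsErasureCoded) {
            // The replication field of an EC file does not hold a real
            // replication factor, so copying it would be meaningless.
            result.remove(Attr.REPLICATION);
        }
        return result;
    }
}
```

For example, a request for REPLICATION and PERMISSION on an EC source would yield only PERMISSION, which avoids asking the destination for a replication factor of 0.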



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

