hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7717) Erasure Coding: distribute replication to EC conversion work to DataNode
Date Thu, 08 Dec 2016 04:28:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731052#comment-15731052 ]

Andrew Wang commented on HDFS-7717:

Thanks for the writeup Sammi, a few thoughts:

* Regarding proposal #2, as Jing said above, even for tools like the Mover, the DNs are still
the ones doing the work. The client is telling DNs where to read and write data, and does
not move data itself. We still need to spread the encoding/striping work across the DNs for
EC conversion.
* Given that there's the SPS work to move the balancer/mover into the NN, it feels like if
we do anything, it should be in the NN too. The same reasons that I like SPS are why this
should also be in the NN.

Overall though I don't think this work is high-priority. Until we have whole-block EC, converting
from replicated to striped requires rewriting all the data, so there aren't any efficiency
gains over just using distcp. We'll know more about use cases once 3.0 goes GA and we get some
production experience.
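
To make the "rewriting all the data" point concrete, here is a minimal sketch of how a logical file offset maps to a location in a striped layout. It assumes an RS(6,3)-style policy with 6 data blocks per block group, 1 MB cells, and 128 MB blocks; those parameters and the function names are illustrative assumptions, not HDFS APIs.

```python
MB = 1024 * 1024


def striped_location(offset, k=6, cell_size=MB, block_size=128 * MB):
    """Map a logical file offset to (block_group, data_block_index, offset_in_block).

    Assumes cells are laid out round-robin across the k data blocks of a
    block group, and that block_size is a multiple of cell_size.
    """
    group_bytes = k * block_size              # user data held by one block group
    group, in_group = divmod(offset, group_bytes)
    cell, in_cell = divmod(in_group, cell_size)
    stripe, block_index = divmod(cell, k)     # round-robin over the data blocks
    return group, block_index, stripe * cell_size + in_cell


def data_blocks_touched(start, length, k=6, cell_size=MB, block_size=128 * MB):
    """Return the set of (block_group, data_block_index) holding a byte range."""
    touched = set()
    offset, end = start, start + length
    while offset < end:
        group, index, _ = striped_location(offset, k, cell_size, block_size)
        touched.add((group, index))
        # Jump to the start of the next cell.
        offset = (offset // cell_size + 1) * cell_size
    return touched


if __name__ == "__main__":
    # The bytes of the first replicated 128 MB block end up spread over
    # every data block of the first block group.
    print(sorted(data_blocks_touched(0, 128 * MB)))
```

Under these assumptions, the contents of a single replicated block land in all six data blocks of a block group, so no existing replica can be reused as-is for the striped file. Every byte has to be read and written again, which is the same I/O cost as a distcp.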

> Erasure Coding: distribute replication to EC conversion work to DataNode
> ------------------------------------------------------------------------
>                 Key: HDFS-7717
>                 URL: https://issues.apache.org/jira/browse/HDFS-7717
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Jing Zhao
>            Assignee: SammiChen
> In the *striping* erasure coding case, we need some approach to distribute the conversion work
> between replication and striping erasure coding to the DataNodes. It can be the NameNode, or a tool
> utilizing MR just like the current distcp, or another one like the balancer/mover.
