hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8014) Erasure Coding: local and remote block reader for coding work in DataNode
Date Mon, 20 Apr 2015 21:46:02 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503731#comment-14503731 ]

Zhe Zhang commented on HDFS-8014:

bq. what is the reason not to simply use DFSInputStream?
Thanks for bringing it up, Nicholas. That's another option we've discussed.

Intuitively, {{DFSInputStream}} and {{DFSOutputStream}} carry some additional overhead to handle
file-level logic, slow readers/writers, etc. The DN should be able to push/pull blocks more
efficiently. I haven't looked at that overhead thoroughly yet, so this is a good chance to
brainstorm here.

To begin with, how do we create a {{DFSInputStream}} on a DN? Following the regular path,
we'll have an unnecessary RPC to the NN to fetch block locations (the DN should already have
them from the block reconstruction command).

Maybe expose {{actualGetFromOneDataNode}} and just reuse that code? To do that, we still need
to create a {{DFSClient}} on the DN. Looks a little bit weird but I don't see a real downside.
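
To make the extra round trip concrete, here is a rough, self-contained sketch of the two read paths. None of these types are real HDFS classes: {{FakeNameNode}}, {{ReconstructionCommand}}, {{readViaStream}}, {{readDirect}} and the {{dn1}}/{{dn2}} names are all hypothetical stand-ins, and the "read" just returns a string. The only point is that the regular path costs one location lookup per read, while a reader fed by the reconstruction command costs none.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch only; these are NOT real HDFS classes or APIs. */
public class BlockReadPathSketch {

    /** Stand-in for the NameNode; counts the location lookups it serves. */
    static class FakeNameNode {
        int locationRpcs = 0;

        List<String> getBlockLocations(String blockId) {
            locationRpcs++; // each call models one RPC to the NN
            return Arrays.asList("dn1", "dn2");
        }
    }

    /** Stand-in for a reconstruction command that already carries the source locations. */
    static class ReconstructionCommand {
        final String blockId;
        final List<String> sourceLocations;

        ReconstructionCommand(String blockId, List<String> sourceLocations) {
            this.blockId = blockId;
            this.sourceLocations = sourceLocations;
        }
    }

    /** Regular client-style path: ask the NN where the block lives, then read. */
    static String readViaStream(FakeNameNode nn, String blockId) {
        List<String> locations = nn.getBlockLocations(blockId);
        return readFrom(locations.get(0), blockId);
    }

    /** Direct path: reuse the locations from the command, no NN lookup. */
    static String readDirect(ReconstructionCommand cmd) {
        return readFrom(cmd.sourceLocations.get(0), cmd.blockId);
    }

    /** Placeholder for the actual block transfer from one DataNode. */
    static String readFrom(String dataNode, String blockId) {
        return blockId + "@" + dataNode;
    }

    public static void main(String[] args) {
        FakeNameNode nn = new FakeNameNode();

        String viaStream = readViaStream(nn, "blk_1");
        System.out.println(viaStream + ", NN lookups so far: " + nn.locationRpcs);

        ReconstructionCommand cmd =
            new ReconstructionCommand("blk_1", Arrays.asList("dn1", "dn2"));
        String direct = readDirect(cmd);
        System.out.println(direct + ", NN lookups so far: " + nn.locationRpcs);
    }
}
```

Both paths end up reading the same block from the same node; the second simply skips the lookup because the command already supplied the locations.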

> Erasure Coding: local and remote block reader for coding work in DataNode
> -------------------------------------------------------------------------
>                 Key: HDFS-8014
>                 URL: https://issues.apache.org/jira/browse/HDFS-8014
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Kai Zheng
>            Assignee: Zhe Zhang
> As a task of HDFS-7344 ECWorker, in either striping or non-striping erasure coding,
to perform encoding or decoding, we first need to be able to read data blocks locally or
remotely. This task is to come up with a block reader facility on the DataNode side. It is
better to consider the similar work done on the client side, so that in future it's possible
to unify the two.

This message was sent by Atlassian JIRA