hadoop-hdfs-issues mailing list archives

From "Virajith Jalaparti (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10636) Modify ReplicaInfo to remove the assumption that replica metadata and data are stored in java.io.File.
Date Fri, 19 Aug 2016 01:50:21 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427479#comment-15427479 ]

Virajith Jalaparti commented on HDFS-10636:

Hi [~eddyxu], 

bq. E.g., If you implement recovery logic in Azure or S3, the input should not be a File based
Good point, I agree. Using {{StorageLocation}} instead of {{File}} makes sense
(other parts of the patch already aim to do this). I will make the necessary changes. 
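
To illustrate why a location-based input generalizes better than a {{File}}, here is a minimal sketch. All types below ({{SketchStorageLocation}}, {{SketchRecovery}}) are hypothetical stand-ins written for this example and are not the actual Hadoop classes; the URIs are made up.

```java
import java.io.File;
import java.net.URI;

// Hypothetical stand-in for a storage location. This is NOT the real
// Hadoop StorageLocation class; it only wraps a URI for illustration.
final class SketchStorageLocation {
    private final URI uri;

    SketchStorageLocation(URI uri) {
        this.uri = uri;
    }

    URI getUri() {
        return uri;
    }

    // Assumption for this sketch: only the "file" scheme means local disk.
    boolean isLocal() {
        return "file".equals(uri.getScheme());
    }
}

// Hypothetical recovery entry points, named for this sketch only.
final class SketchRecovery {
    // File-based input: usable only when the replica lives on local disk.
    static String recover(File dir) {
        return "recovering from local directory " + dir.getPath();
    }

    // Location-based input: one entry point serves local and external storage.
    static String recover(SketchStorageLocation location) {
        String kind = location.isLocal() ? "local" : "external";
        return "recovering from " + kind + " location " + location.getUri();
    }
}

final class SketchRecoveryDemo {
    public static void main(String[] args) {
        System.out.println(SketchRecovery.recover(
            new SketchStorageLocation(URI.create("file:///data/dn1"))));
        System.out.println(SketchRecovery.recover(
            new SketchStorageLocation(URI.create("s3a://bucket/blocks"))));
    }
}
```

With the {{File}} overload, an S3 or Azure backed volume cannot be expressed at all; with the location overload, the caller does not need to change when a new storage type is added.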

bq. Would that still be the case for Azure? 
For HDFS-9806, the idea would be to have a {{ProvidedReplica}} which will be used to refer
to data in external storage. As shown in HDFS-10675, this class would implement {{ReplicaInfo}}
and {{ReplicaInPipeline}}, so that we can re-use the existing code for building replication
pipelines and go through the block life cycle. 
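
A rough sketch of that shape, under stated assumptions: both types are modeled here as small interfaces with a handful of invented methods, and every name prefixed with {{Sketch}} is hypothetical. The real {{ReplicaInfo}} and {{ReplicaInPipeline}} have many more members and may differ in kind.

```java
import java.net.URI;

// Simplified stand-ins for this sketch; not the real HDFS types.
interface SketchReplicaInfo {
    long getBlockId();
    long getNumBytes();
    URI getBlockURI();
}

interface SketchReplicaInPipeline {
    long getBytesAcked();
    void setBytesAcked(long bytesAcked);
}

// Hypothetical provided replica: its data is addressed by a URI in an
// external store rather than by a java.io.File.
final class SketchProvidedReplica
        implements SketchReplicaInfo, SketchReplicaInPipeline {
    private final long blockId;
    private final long numBytes;
    private final URI blockUri;
    private long bytesAcked;

    SketchProvidedReplica(long blockId, long numBytes, URI blockUri) {
        this.blockId = blockId;
        this.numBytes = numBytes;
        this.blockUri = blockUri;
    }

    @Override public long getBlockId() { return blockId; }
    @Override public long getNumBytes() { return numBytes; }
    @Override public URI getBlockURI() { return blockUri; }
    @Override public long getBytesAcked() { return bytesAcked; }

    @Override
    public void setBytesAcked(long bytesAcked) {
        if (bytesAcked < 0 || bytesAcked > numBytes) {
            throw new IllegalArgumentException("bytesAcked out of range: " + bytesAcked);
        }
        this.bytesAcked = bytesAcked;
    }
}

// Pipeline code written against the interface does not care where the bytes live.
final class SketchPipeline {
    static long acknowledge(SketchReplicaInPipeline replica, long bytes) {
        replica.setBytesAcked(replica.getBytesAcked() + bytes);
        return replica.getBytesAcked();
    }
}
```

Because {{SketchPipeline.acknowledge}} only depends on the pipeline interface, the same code path would serve local and provided replicas, which is the re-use the paragraph above is after.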

bq. so I guess many vendors might expect using {{FinalizedReplica}} to directly represent
the block data.
Are you suggesting this in the context of HDFS-9806? If so, I think they should be using {{ProvidedReplica}}
(as mentioned in my previous point). If not, are there existing cases where vendors might
be using {{FinalizedReplica}} for non-{{File}} backed data?

One way to deal with {{FinalizedReplica}} and the reliance on class types is to have an
abstract class for each replica type, with separate implementations for local and provided
replicas. However, this would result in a large number of classes that are essentially {{ReplicaInfo}}.
Having good test coverage seems like a cleaner way to solve this. Thoughts? 
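
For concreteness, the abstract-class alternative could look roughly like the sketch below. All class names are hypothetical and exist only for this example; only the finalized state is shown, and every other replica state would need the same trio of classes.

```java
import java.io.File;
import java.net.URI;

// Hypothetical abstract type for the finalized state. Callers that test
// the class type would test against this, not a concrete implementation.
abstract class SketchFinalized {
    private final long blockId;

    SketchFinalized(long blockId) {
        this.blockId = blockId;
    }

    long getBlockId() {
        return blockId;
    }

    // Each implementation decides how its data is addressed.
    abstract URI getDataURI();
}

// Finalized replica backed by a local file.
final class SketchLocalFinalized extends SketchFinalized {
    private final File blockFile;

    SketchLocalFinalized(long blockId, File blockFile) {
        super(blockId);
        this.blockFile = blockFile;
    }

    @Override
    URI getDataURI() {
        return blockFile.toURI();
    }
}

// Finalized replica whose data lives in an external store.
final class SketchProvidedFinalized extends SketchFinalized {
    private final URI remoteUri;

    SketchProvidedFinalized(long blockId, URI remoteUri) {
        super(blockId);
        this.remoteUri = remoteUri;
    }

    @Override
    URI getDataURI() {
        return remoteUri;
    }
}

final class SketchChecks {
    // A class-type check that holds for both local and provided replicas.
    static boolean isFinalized(Object replica) {
        return replica instanceof SketchFinalized;
    }
}
```

Existing class-type checks keep working for both kinds of replica, at the cost of three classes per replica state.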

> Modify ReplicaInfo to remove the assumption that replica metadata and data are stored in java.io.File.
> ------------------------------------------------------------------------------------------------------
>                 Key: HDFS-10636
>                 URL: https://issues.apache.org/jira/browse/HDFS-10636
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, fs
>            Reporter: Virajith Jalaparti
>            Assignee: Virajith Jalaparti
>         Attachments: HDFS-10636.001.patch, HDFS-10636.002.patch, HDFS-10636.003.patch, HDFS-10636.004.patch, HDFS-10636.005.patch
>
> Replace java.io.File related APIs from {{ReplicaInfo}}, and enable the definition of new {{ReplicaInfo}} sub-classes whose metadata and data can be present on external storages

