hadoop-hdfs-issues mailing list archives

From "Virajith Jalaparti (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9809) Abstract implementation-specific details from the datanode
Date Thu, 28 Apr 2016 21:58:14 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263105#comment-15263105 ]

Virajith Jalaparti commented on HDFS-9809:
------------------------------------------

I am attaching a new patch that includes the following changes in addition to the ones posted
earlier. (We also ported the earlier changes to a more recent version of trunk.)

# Using {{ReplicaInfo.getState()}} to get the state of a {{ReplicaInfo}} instead of using
{{instanceof}}. A related change is to refer to replica objects through the {{ReplicaInfo}}
class rather than a particular subclass (this required adding abstract methods
to the {{ReplicaInfo}} class).
# Adding a {{ReplicaBuilder}} and replacing calls to the constructors of the different {{ReplicaInfo}}
subclasses ({{ReplicaInPipeline}}, {{ReplicaBeingWritten}}, etc.) with calls to the {{ReplicaBuilder}}
with the appropriate parameters ({{ReplicaState}}, {{blockId}}, etc.) set.
# Adding a {{FsVolumeImplBuilder}} and replacing calls to the constructor of {{FsVolumeImpl}}
with calls to the builder.
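
A minimal sketch of how changes (1) and (2) fit together, under stated assumptions: the classes below are simplified stand-ins written for illustration, not the code in the attached patch. The setter names, the reduced set of {{ReplicaState}} values, and the fields carried by each replica are all hypothetical.

```java
// Hypothetical, simplified stand-ins for the HDFS classes named above.
// The real ReplicaInfo hierarchy carries many more fields and states.
enum ReplicaState { FINALIZED, RBW, TEMPORARY }

abstract class ReplicaInfo {
    private final long blockId;
    private final long generationStamp;

    protected ReplicaInfo(long blockId, long generationStamp) {
        this.blockId = blockId;
        this.generationStamp = generationStamp;
    }

    long getBlockId() { return blockId; }
    long getGenerationStamp() { return generationStamp; }

    // Callers branch on this value instead of testing the subclass with instanceof.
    abstract ReplicaState getState();
}

final class FinalizedReplica extends ReplicaInfo {
    FinalizedReplica(long blockId, long generationStamp) { super(blockId, generationStamp); }
    @Override ReplicaState getState() { return ReplicaState.FINALIZED; }
}

final class ReplicaBeingWritten extends ReplicaInfo {
    ReplicaBeingWritten(long blockId, long generationStamp) { super(blockId, generationStamp); }
    @Override ReplicaState getState() { return ReplicaState.RBW; }
}

// The builder is the only place that knows which subclass backs which state.
final class ReplicaBuilder {
    private final ReplicaState state;
    private long blockId;
    private long generationStamp;

    ReplicaBuilder(ReplicaState state) { this.state = state; }

    ReplicaBuilder setBlockId(long blockId) { this.blockId = blockId; return this; }

    ReplicaBuilder setGenerationStamp(long generationStamp) {
        this.generationStamp = generationStamp;
        return this;
    }

    ReplicaInfo build() {
        switch (state) {
            case FINALIZED:
                return new FinalizedReplica(blockId, generationStamp);
            case RBW:
                return new ReplicaBeingWritten(blockId, generationStamp);
            default:
                // This sketch only wires up two states.
                throw new IllegalArgumentException("Unsupported state: " + state);
        }
    }
}
```

With this shape, a call site would hold a plain {{ReplicaInfo}} and compare {{getState()}} against a {{ReplicaState}} value, so supporting a new kind of replica means touching the builder rather than every caller.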

The idea behind the changes in (1) and (2) above is to add a new {{ProvidedReplica}} class
(an implementation of {{ReplicaInfo}}) that can be:
(a) used to represent replicas stored in a provided storage (described in more detail in the
design documentation of HDFS-9806);
(b) treated as any other {{ReplicaInfo}} in the rest of the code, which would avoid changes
to the rest of the Datanode as part of HDFS-9806; and
(c) written to using the existing replication pipeline, without implementing a separate write
pipeline for HDFS-9806.
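
To illustrate point (b), here is a rough sketch of a caller that reads block data without knowing whether the replica is local or provided. Everything in it is an assumption made for illustration: {{openData()}}, {{getBlockURI()}}, {{LocalReplica}}, and {{BlockReader}} are hypothetical names, and backing a {{ProvidedReplica}} with a URI is one possible design, not necessarily the one HDFS-9806 adopts.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical, reduced replica hierarchy; not the patch's actual API.
abstract class ReplicaInfo {
    private final long blockId;

    protected ReplicaInfo(long blockId) { this.blockId = blockId; }

    long getBlockId() { return blockId; }

    // Abstract accessors, so callers never touch java.io.File directly.
    abstract InputStream openData() throws IOException;
    abstract URI getBlockURI();
}

// Replica whose data lives in a file on a local volume.
final class LocalReplica extends ReplicaInfo {
    private final Path blockFile;

    LocalReplica(long blockId, Path blockFile) {
        super(blockId);
        this.blockFile = blockFile;
    }

    @Override InputStream openData() throws IOException { return Files.newInputStream(blockFile); }
    @Override URI getBlockURI() { return blockFile.toUri(); }
}

// Replica whose data lives in an external store, identified here by a URI.
final class ProvidedReplica extends ReplicaInfo {
    private final URI remote;

    ProvidedReplica(long blockId, URI remote) {
        super(blockId);
        this.remote = remote;
    }

    // Only works for schemes the JDK has a URL handler for (file, http, ...);
    // a real provided store would plug in its own client here.
    @Override InputStream openData() throws IOException { return remote.toURL().openStream(); }
    @Override URI getBlockURI() { return remote; }
}

final class BlockReader {
    // Accepts any ReplicaInfo; no instanceof checks and no File assumptions.
    static byte[] readFully(ReplicaInfo replica) throws IOException {
        try (InputStream in = replica.openData()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

The caller in {{BlockReader}} compiles against the abstract class only, which is the property that would let the rest of the Datanode stay unchanged when a new replica type is introduced.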

The idea behind (3) is to construct the appropriate volume based on the {{StorageLocation}}
(specified using {{dfs.datanode.data.dir}}). For example, as part of HDFS-9806, if a {{StorageLocation}}
is of type PROVIDED, we would construct a {{ProvidedVolumeImpl}}. Otherwise, an {{FsVolumeImpl}}
would be built.

Each location is opaque to the Datanode and is resolved by its volume. This allows the Datanode
to serve data from volumes that are not local filesystems.
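
The volume selection described for (3) could look roughly like the sketch below. It is an illustration under assumptions, not the patch itself: the {{Volume}} interface, the {{parse}} helper, the {{setStorageLocation}} setter, and the reduced {{StorageType}} enum are hypothetical, and the {{[PROVIDED]}} prefix simply extends the existing {{[TYPE]uri}} convention of {{dfs.datanode.data.dir}} entries for the sake of the example.

```java
import java.net.URI;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, reduced set of storage types for this sketch.
enum StorageType { DISK, SSD, PROVIDED }

final class StorageLocation {
    // Matches entries of the form "[TYPE]uri".
    private static final Pattern ENTRY = Pattern.compile("^\\[(\\w+)\\](.+)$");

    final StorageType type;
    final URI uri;

    private StorageLocation(StorageType type, URI uri) {
        this.type = type;
        this.uri = uri;
    }

    // Parses one data-dir entry; entries without a prefix default to DISK.
    // An unknown type name makes StorageType.valueOf throw IllegalArgumentException.
    static StorageLocation parse(String entry) {
        String trimmed = entry.trim();
        Matcher m = ENTRY.matcher(trimmed);
        if (m.matches()) {
            StorageType type = StorageType.valueOf(m.group(1).toUpperCase(Locale.ROOT));
            return new StorageLocation(type, URI.create(m.group(2)));
        }
        return new StorageLocation(StorageType.DISK, URI.create(trimmed));
    }
}

// Minimal stand-in for the volume abstraction.
interface Volume {
    StorageLocation getLocation();
}

class FsVolumeImpl implements Volume {
    private final StorageLocation location;
    FsVolumeImpl(StorageLocation location) { this.location = location; }
    @Override public StorageLocation getLocation() { return location; }
}

class ProvidedVolumeImpl implements Volume {
    private final StorageLocation location;
    ProvidedVolumeImpl(StorageLocation location) { this.location = location; }
    @Override public StorageLocation getLocation() { return location; }
}

final class FsVolumeImplBuilder {
    private StorageLocation location;

    FsVolumeImplBuilder setStorageLocation(StorageLocation location) {
        this.location = location;
        return this;
    }

    // The location stays opaque to the caller; the builder picks the volume
    // class, and the volume is what later interprets the URI.
    Volume build() {
        if (location == null) {
            throw new IllegalStateException("storage location not set");
        }
        if (location.type == StorageType.PROVIDED) {
            return new ProvidedVolumeImpl(location);
        }
        return new FsVolumeImpl(location);
    }
}
```

Because callers only ever ask the builder for a volume, adding another volume implementation would be a change confined to {{build()}}.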

> Abstract implementation-specific details from the datanode
> ----------------------------------------------------------
>
>                 Key: HDFS-9809
>                 URL: https://issues.apache.org/jira/browse/HDFS-9809
>             Project: Hadoop HDFS
>          Issue Type: Task
>          Components: datanode, fs
>            Reporter: Virajith Jalaparti
>            Assignee: Virajith Jalaparti
>         Attachments: HDFS-9809.001.patch, HDFS-9809.002.patch
>
>
> Multiple parts of the Datanode (FsVolumeSpi, ReplicaInfo, FsVolumeImpl, etc.) implicitly
> assume that blocks are stored in java.io.File(s) and that volumes are divided into directories.
> We propose to abstract these details, which would help in supporting other storages.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
