hadoop-hdfs-issues mailing list archives

From "Eli Collins (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5194) Robust support for alternate FsDatasetSpi implementations
Date Tue, 19 Nov 2013 22:09:23 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826996#comment-13826996 ]

Eli Collins commented on HDFS-5194:

Notes from the call:
- Attendees: Dave Powell, Eric Sirianni, Andrew Wang, Eli Collins
- Scope here is non-file storage for the DataNode, specifically the subset of DataNode storage
used for storing HDFS blocks, given that parts of the data directory (e.g. MD) are managed via
DataStorage, which is not covered here. We could make DataStorage pluggable in the future as well,
independent of this; that would probably require shuffling functionality that plugins would want
to share outside.
- FsDatasetSpi is currently private; we need to come up with an API (for the Spi and the classes
it returns) that could be declared stable so that users would not have to maintain different
plugins for subsequent 2.x releases.
- It would help to have a dummy plugin to articulate which interfaces are public and to catch
API and semantic breakages. It is also a potential place for plugin authors to share code. Maintaining
a functional dummy plugin is expensive, so it might make more sense to start with something that's
compile-only.
- Currently there is functionality in the FsDataset implementations that could be shared across
plugins. Moving it outside would decrease the effort required to swap out FsDataset
and make it easier to maintain semantic compatibility.
- Pluggability is currently DataNode-wide; it might make sense to have the ability to specify
the plugin on a per-volume basis, for example to use different plugins for different
types of storage (HDFS-2832).
- Should look into replacing standard Java IO classes with Hadoop-specific classes in the
relevant FsDataset APIs, since they have baked-in assumptions around file-based storage and
interface baggage.
- Next step is to break down the HDFS-5194 proposal into sub-tasks and hash out each patch
individually. Perhaps create a feature branch if there are sufficiently many patches that
need to stay out of trunk.
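
To make a few of the points above concrete (stream-based rather than file-based APIs, a dummy
plugin, and per-volume plugin selection), here is a minimal sketch. It is only an illustration
under stated assumptions: BlockStore, InMemoryBlockStore, BlockStoreRegistry and the storage-type
labels are hypothetical names invented for this example. They are not part of Hadoop and do not
reflect the actual FsDatasetSpi signatures or the HDFS-5194 patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// HYPOTHETICAL sketch: none of these types exist in Hadoop.

/** Block access expressed as streams, with no java.io.File in the signatures. */
interface BlockStore {
    OutputStream createBlock(long blockId) throws IOException;
    InputStream openBlock(long blockId) throws IOException;
    long getLength(long blockId) throws IOException;
}

/** Dummy plugin: keeps closed blocks in a map instead of on a file system. */
final class InMemoryBlockStore implements BlockStore {
    private final Map<Long, byte[]> blocks = new ConcurrentHashMap<>();

    @Override
    public OutputStream createBlock(final long blockId) {
        // The block becomes visible to readers only once the writer closes it.
        return new ByteArrayOutputStream() {
            @Override
            public void close() {
                blocks.put(blockId, toByteArray());
            }
        };
    }

    @Override
    public InputStream openBlock(long blockId) throws IOException {
        return new ByteArrayInputStream(lookup(blockId));
    }

    @Override
    public long getLength(long blockId) throws IOException {
        return lookup(blockId).length;
    }

    private byte[] lookup(long blockId) throws IOException {
        byte[] data = blocks.get(blockId);
        if (data == null) {
            throw new FileNotFoundException("No such block: " + blockId);
        }
        return data;
    }
}

/** Chooses a plugin per volume, keyed by a storage-type label. */
final class BlockStoreRegistry {
    private final Map<String, Supplier<BlockStore>> factories = new ConcurrentHashMap<>();
    private final Map<String, BlockStore> volumes = new ConcurrentHashMap<>();

    void register(String storageType, Supplier<BlockStore> factory) {
        factories.put(storageType, factory);
    }

    BlockStore forVolume(String volumeId, String storageType) {
        Supplier<BlockStore> factory = factories.get(storageType);
        if (factory == null) {
            throw new IllegalArgumentException("No plugin for storage type: " + storageType);
        }
        // One store instance per volume, created on first use.
        return volumes.computeIfAbsent(volumeId, id -> factory.get());
    }
}
```

With something like this, a caller would register a factory once per storage type (for example
registry.register("MEMORY", InMemoryBlockStore::new)) and then ask for the store backing a given
volume. A real design would also need to cover replica state, metadata and checksums, which this
sketch leaves out entirely.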

> Robust support for alternate FsDatasetSpi implementations
> ---------------------------------------------------------
>                 Key: HDFS-5194
>                 URL: https://issues.apache.org/jira/browse/HDFS-5194
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, hdfs-client
>            Reporter: David Powell
>            Priority: Minor
>         Attachments: HDFS-5194.design.09112013.pdf, HDFS-5194.patch.09112013
> The existing FsDatasetSpi interface is well-positioned to permit extending Hadoop to
> run natively on non-traditional storage architectures.  Before this can be done, however,
> a number of gaps need to be addressed.  This JIRA documents those gaps, suggests some solutions,
> and puts forth a sample implementation of some of the key changes needed.

This message was sent by Atlassian JIRA