hadoop-hdfs-dev mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-5751) Remove the FsDatasetSpi and FsVolumeImpl interfaces
Date Fri, 10 Jan 2014 00:18:50 GMT
Arpit Agarwal created HDFS-5751:

             Summary: Remove the FsDatasetSpi and FsVolumeImpl interfaces
                 Key: HDFS-5751
                 URL: https://issues.apache.org/jira/browse/HDFS-5751
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: datanode, test
    Affects Versions: 3.0.0
            Reporter: Arpit Agarwal

The in-memory block map and disk interface portions of the DataNode have been abstracted out
into an {{FsDatasetSpi}} interface, which further uses {{FsVolumeSpi}} to represent individual
volumes.

The abstraction is useful as it allows DataNode tests to use a {{SimulatedFSDataset}} which
does not write any data to disk. Instead it just stores block metadata in memory and returns
blank data for all reads. This is useful for both unit testing and for simulating arbitrarily
large datanodes without having to provision real disk capacity.
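
To illustrate the idea, here is a minimal sketch of a simulated dataset. It is not the actual Hadoop code: {{BlockDataset}}, {{SimulatedBlockDataset}} and their methods are hypothetical names invented for this example, and the real interface is considerably larger. The sketch records only the length of each block and returns zero-filled data on reads.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Small demo driver; all type names in this file are hypothetical.
class SimulatedDatasetDemo {
    public static void main(String[] args) {
        BlockDataset dataset = new SimulatedBlockDataset();
        dataset.writeBlock(42L, new byte[] {1, 2, 3});
        // The payload was discarded on write, so the read returns zeros.
        System.out.println(Arrays.toString(dataset.readBlock(42L))); // prints [0, 0, 0]
    }
}

// Heavily simplified stand-in for a dataset interface (assumption, not the real SPI).
interface BlockDataset {
    void writeBlock(long blockId, byte[] data);

    byte[] readBlock(long blockId);
}

// Simulated dataset: keeps block metadata (length only) in memory, never touches disk.
class SimulatedBlockDataset implements BlockDataset {
    private final Map<Long, Integer> blockLengths = new HashMap<>();

    @Override
    public void writeBlock(long blockId, byte[] data) {
        // Remember how long the block is; drop the bytes themselves.
        blockLengths.put(blockId, data.length);
    }

    @Override
    public byte[] readBlock(long blockId) {
        Integer length = blockLengths.get(blockId);
        if (length == null) {
            throw new IllegalArgumentException("Unknown block " + blockId);
        }
        // Java zero-initializes new arrays, which gives the "blank data" behaviour.
        return new byte[length];
    }
}
```

Because only lengths are stored, memory use in this sketch depends on the number of blocks rather than their size, which is what makes simulating very large datanodes cheap.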

A 'real' DataNode uses {{FsDatasetImpl}}. Both {{FsDatasetImpl}} and {{SimulatedFSDataset}}
implement {{FsDatasetSpi}}.

However, there are a few problems with this approach:
# Using the factory class significantly complicates the code flow for the common case. This
makes the code harder to understand and debug.
# There is the additional burden of maintaining two different dataset implementations.
# Fidelity between the two implementations is poor.

Instead, we can eliminate the SPIs and just hide the disk read/write routines with a dependency
injection framework like Google Guice.
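
A rough sketch of what this could look like follows. To stay dependency-free it uses plain constructor injection rather than Guice, and every name in it ({{BlockIo}}, {{InMemoryBlockIo}}, {{SingleDataset}}) is hypothetical. The point is only that one dataset class owns the block map, while the disk routines sit behind a narrow seam that tests can replace.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Small demo driver; all type names in this file are hypothetical.
class InjectionDemo {
    public static void main(String[] args) {
        // Wiring is done by hand here; an injector would supply the BlockIo instead.
        SingleDataset dataset = new SingleDataset(new InMemoryBlockIo());
        dataset.addBlock(7L, new byte[] {9, 8});
        System.out.println(dataset.blockLength(7L));                // prints 2
        System.out.println(Arrays.toString(dataset.readBlock(7L))); // prints [9, 8]
    }
}

// Narrow seam covering only the raw read/write routines (assumption for illustration).
interface BlockIo {
    void write(long blockId, byte[] data);

    byte[] read(long blockId);
}

// Test double: keeps payloads in memory instead of writing them to disk.
class InMemoryBlockIo implements BlockIo {
    private final Map<Long, byte[]> blocks = new HashMap<>();

    @Override
    public void write(long blockId, byte[] data) {
        blocks.put(blockId, data.clone());
    }

    @Override
    public byte[] read(long blockId) {
        byte[] data = blocks.get(blockId);
        if (data == null) {
            throw new IllegalArgumentException("Unknown block " + blockId);
        }
        return data.clone();
    }
}

// The only dataset class: block map logic is shared, only the I/O is injected.
class SingleDataset {
    private final BlockIo io;
    private final Map<Long, Integer> blockMap = new HashMap<>();

    SingleDataset(BlockIo io) {
        this.io = io;
    }

    void addBlock(long blockId, byte[] data) {
        io.write(blockId, data);
        blockMap.put(blockId, data.length);
    }

    int blockLength(long blockId) {
        Integer length = blockMap.get(blockId);
        if (length == null) {
            throw new IllegalArgumentException("Unknown block " + blockId);
        }
        return length;
    }

    byte[] readBlock(long blockId) {
        blockLength(blockId); // rejects blocks missing from the block map
        return io.read(blockId);
    }
}
```

With this shape, tests and production code exercise the same block map logic, which addresses the fidelity concern above. A production build would pass a disk-backed {{BlockIo}} in place of the in-memory one.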

This message was sent by Atlassian JIRA
