hadoop-hdfs-issues mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9806) Allow HDFS block replicas to be provided by an external storage system
Date Sat, 13 Feb 2016 01:37:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15145688#comment-15145688 ]

Chris Douglas commented on HDFS-9806:

Obviously, we won't map all the semantics of HDFS to arbitrary storage systems. Demonstrating
that the state reported by HDFS is durable, that the visibility of data across systems is
consistently ordered, etc. would require deeper coordination than is practical. So we make a
few simplifying assumptions about the analytic workloads that are common in practice.

In the model this proposes, an HDFS replica may be "provided" by storage accessible through
multiple datanodes. This is exposed to HDFS as a storage tier, as a peer to other, locally-managed
storage media (e.g., SSD). A replica stored in provided media is served by a client of the
backing store that maintains a mapping from block IDs to a corresponding identifier in the
backing store. Examples include file regions and object identifiers. By participating as a
storage tier, a client may use other features of HDFS (e.g., storage-level quotas, security)
to manage the local media as a read/write cache of provided blocks. By using local media,
we hope to not only improve performance for Apache Hadoop applications working with remote
storage, but also to maintain HDFS semantics _by using HDFS_ as the storage platform.
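
As a rough illustration of the mapping described above, the sketch below keeps a table from
block IDs to file regions in a backing store. The names `FileRegion`, `BlockAliasTable`, and
`split_into_blocks` are hypothetical and are not HDFS APIs; the actual design is left to the
design document.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FileRegion:
    """Hypothetical identifier for a byte range in a backing store."""
    uri: str      # location of the file or object in the remote store
    offset: int   # first byte of the region
    length: int   # number of bytes in the region


class BlockAliasTable:
    """Hypothetical map from HDFS block IDs to regions in a backing store."""

    def __init__(self) -> None:
        self._regions: Dict[int, FileRegion] = {}

    def put(self, block_id: int, region: FileRegion) -> None:
        self._regions[block_id] = region

    def resolve(self, block_id: int) -> Optional[FileRegion]:
        # None means the block is not provided by this store.
        return self._regions.get(block_id)


def split_into_blocks(uri: str, file_length: int, block_size: int,
                      first_block_id: int) -> BlockAliasTable:
    """Assign consecutive block IDs to fixed-size regions of one remote file."""
    table = BlockAliasTable()
    offset, block_id = 0, first_block_id
    while offset < file_length:
        length = min(block_size, file_length - offset)
        table.put(block_id, FileRegion(uri, offset, length))
        offset += length
        block_id += 1
    return table
```

A client serving a provided replica would call `resolve` with the block ID and read the
returned region from the backing store; an opaque object identifier could take the place of
the file region.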

This generalizes existing work (HDFS-5318) supporting “shared, read-only” data in the
namespace in HDFS. Currently, every Datanode will report a (redundant) replica from shared,
read-only storage. This preserves two assumptions in the Namenode: first, that every block
storage is attached to only one Datanode, and second, that every replica location will be tracked
and reported to the Namenode. The Datanode implementation was not contributed to Apache,
but presumably one would group replicas and report a subset from each Datanode to avoid reporting
the entire cluster as a location for every shared block.
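
Since that Datanode implementation is not available, the following is only a guess at what
"group replicas and report a subset from each Datanode" could look like: each shared block
is assigned to exactly one reporting Datanode by a stable hash. The function names are
hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List


def assign_reporter(block_id: int, datanodes: List[str]) -> str:
    """Pick the one Datanode that reports a shared block, using a stable hash."""
    digest = hashlib.sha256(str(block_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(datanodes)
    return datanodes[index]


def partition_reports(block_ids: Iterable[int],
                      datanodes: List[str]) -> Dict[str, List[int]]:
    """Group shared blocks so that each appears in exactly one Datanode's report."""
    reports: Dict[str, List[int]] = {dn: [] for dn in datanodes}
    for block_id in block_ids:
        reports[assign_reporter(block_id, datanodes)].append(block_id)
    return reports
```

With this scheme the Namenode sees one location per shared block rather than the whole
cluster, at the cost of the assignment changing whenever the set of Datanodes changes.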

In contrast, provided storage does not produce reports of every block accessible through it.
Each Datanode registers a consistent storage ID with the Namenode, which is configured to
refresh both the block mappings and, where applicable, the corresponding namespace.
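
One possible reading of this, as a sketch only: the Namenode tracks which Datanodes have
registered a given provided storage ID and which blocks that storage currently maps, so a
block's locations are derived rather than reported per replica. `NamenodeSketch` and its
methods are hypothetical and do not correspond to HDFS classes.

```python
from typing import Dict, Iterable, Set


class NamenodeSketch:
    """Hypothetical bookkeeping for provided storage; not HDFS code."""

    def __init__(self) -> None:
        self._datanodes_by_storage: Dict[str, Set[str]] = {}
        self._storage_by_block: Dict[int, str] = {}

    def register(self, datanode: str, storage_id: str) -> None:
        # Every Datanode with access to the store registers the same storage ID.
        self._datanodes_by_storage.setdefault(storage_id, set()).add(datanode)

    def refresh_block_map(self, storage_id: str, block_ids: Iterable[int]) -> None:
        # Drop the previous mappings for this storage, then load the new ones.
        self._storage_by_block = {
            block: storage
            for block, storage in self._storage_by_block.items()
            if storage != storage_id
        }
        for block_id in block_ids:
            self._storage_by_block[block_id] = storage_id

    def locations(self, block_id: int) -> Set[str]:
        # A provided block is reachable through any Datanode on its storage.
        storage_id = self._storage_by_block.get(block_id)
        if storage_id is None:
            return set()
        return set(self._datanodes_by_storage.get(storage_id, set()))
```

Here no Datanode sends a per-block report for the provided tier; the refresh step stands in
for whatever mechanism the Namenode is configured to use.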

We're working on a design document, where we will expand on the motivation, use cases, and
implementation changes we propose to make in HDFS.

> Allow HDFS block replicas to be provided by an external storage system
> ----------------------------------------------------------------------
>                 Key: HDFS-9806
>                 URL: https://issues.apache.org/jira/browse/HDFS-9806
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Chris Douglas
> In addition to heterogeneous media, many applications work with heterogeneous storage
> systems. The guarantees and semantics provided by these systems are often similar, but not
> identical, to those of [HDFS|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html].
> Any client accessing multiple storage systems is responsible for reasoning about each system
> independently, and must propagate and renew credentials for each store.
> Remote stores could be mounted under HDFS. Block locations could be mapped to immutable
> file regions, opaque IDs, or other tokens that represent a consistent view of the data. While
> correctness for arbitrary operations requires careful coordination between stores, in practice
> we can provide workable semantics with weaker guarantees.

