hadoop-hdfs-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
Date Tue, 05 Nov 2013 01:49:22 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813539#comment-13813539 ]

Arpit Agarwal commented on HDFS-2832:

{quote}
I thought that the storageID approach that each DN is able to generate a unique id independently
of the others is a good feature to retain.
{quote}
Storage (UU)IDs are independently generated on the Datanode in {{DataStorage#format}}.

{quote}
UUID as you noted is not unique and needs to be coordinated through NameNode.
{quote}
Not true. {{UUID#randomUUID}} generates RFC-4122 compliant UUIDs which are unique for all
practical purposes without NameNode coordination.

{quote}
You can also add to storageID an attribute that characterizes the disk volume or the directory
as a new component. Examples of the new attribute could be disk serial number, or the storage
directory inode number. It seems that introduction of UUIDs was unnecessary, unless of course
I missed some context.
{quote}
Part of the rationale is in HDFS-5115. Making them UUIDs simplifies the generation logic.
Decoupling them from volume/directory characteristics allows future storage media that do
not have a disk serial number or inode number.
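
To illustrate the point about coordination-free generation, here is a minimal sketch of how a
Datanode could mint a storage ID locally. The class name {{StorageIdSketch}}, the method
{{newStorageId}} and the "DS-" prefix are assumptions made for this example; this is not the
code in {{DataStorage#format}}. Only {{java.util.UUID#randomUUID}} is a real JDK call.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical helper for illustration only; not the actual HDFS implementation.
class StorageIdSketch {
    // Assumed prefix, used here only to make the ID recognizable as a storage ID.
    static final String PREFIX = "DS-";

    // Generates an ID from a random (version 4) UUID. No NameNode round trip and
    // no knowledge of other Datanodes is needed.
    static String newStorageId() {
        return PREFIX + UUID.randomUUID();
    }

    public static void main(String[] args) {
        // Sanity check: generate many IDs locally and fail loudly on any duplicate.
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < 100_000; i++) {
            if (!seen.add(newStorageId())) {
                throw new IllegalStateException("unexpected duplicate storage ID");
            }
        }
        System.out.println("generated " + seen.size() + " distinct IDs, e.g. " + newStorageId());
    }
}
```

A version 4 UUID carries 122 random bits, so the ID depends on nothing about the volume or
directory it labels, which is also what allows it to be used for media without a serial
number or inode number.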

> Enable support for heterogeneous storages in HDFS
> -------------------------------------------------
>                 Key: HDFS-2832
>                 URL: https://issues.apache.org/jira/browse/HDFS-2832
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.24.0
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>         Attachments: 20130813-HeterogeneousStorage.pdf, h2832_20131023.patch, h2832_20131023b.patch,
h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch,
> HDFS currently supports configuration where storages are a list of directories. Typically
each of these directories corresponds to a volume with its own file system. All these directories
are homogeneous and therefore identified as a single storage at the namenode. I propose a change
to the current model where Datanode *is a* storage, to Datanode *is a collection* of storages.

This message was sent by Atlassian JIRA