hadoop-hdfs-issues mailing list archives

From "Tanping Wang (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HDFS-1365) HDFS federation: propose ClusterID and BlockPoolID format
Date Sat, 18 Sep 2010 00:30:36 GMT

     [ https://issues.apache.org/jira/browse/HDFS-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Tanping Wang updated HDFS-1365:
-------------------------------

    Attachment: HDFS1365-branch1052-3.patch

Thanks, Biros! This patch addresses the review comments from Biros:

1. LOG in TestStorageInfo is not used
Yes, removed.

2. Lots of code (like TestStorageInfo, UpgradeUtilities and others) is in the patch twice
Regenerated a clean patch. The first patch was messed up due to svn merge from trunk.

3. newBlockPoolID - if we cannot get the IP, should we use "unknownIP" or throw an exception?

The current implementation,

    String ip = "unknownIP";
    try {
      ip = DNS.getDefaultIP("default");
    } catch (UnknownHostException ignored) {
      LOG.warn("Could not find ip address of \"default\" interface.");
    }
    ...
    this.blockpoolID = "BP-" + rand + "-" + ip + "-" + System.currentTimeMillis();
    return this.blockpoolID;

is to add in unknownIP as part of the blockpoolID.

The question here is: if the IP address is not returned for some reason, should we allow a new
blockpoolID to be generated and go ahead with formatting a new block pool, or should this be
forbidden? My answer is that we should generate the blockpoolID with unknownIP. The blockpoolID
is the unique identifier of a block pool, so its uniqueness is essential. We identify a
block pool by its blockpoolID regardless of its DN IP address. We include the DN IP address only
to reduce the chance of generating a duplicate blockpoolID. With or without an IP, as long
as it is unique, it is a good blockpoolID. Since the timestamp is also part of the blockpoolID,
the blockpoolID would most likely still be different even if unknownIP is returned for multiple
data nodes.
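
As an illustration of the fallback described above, here is a minimal standalone sketch. It is not the patch code: the class and method names are hypothetical, java.net.InetAddress stands in for the DNS.getDefaultIP call, and the random component is an assumption, since the snippet above elides how rand is produced.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.SecureRandom;

// Hypothetical sketch; names are illustrative and not taken from the patch.
public class BlockPoolIdSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Returns the local IP address, or the placeholder "unknownIP" if it cannot be resolved. */
    static String localIpOrPlaceholder() {
        try {
            // Assumption: InetAddress replaces the Hadoop DNS helper used in the patch.
            return InetAddress.getLocalHost().getHostAddress();
        } catch (UnknownHostException e) {
            return "unknownIP";
        }
    }

    /** Builds an ID of the form BP-<random>-<ip>-<millis>. */
    static String newBlockPoolId(String ip, long nowMillis) {
        // Assumption: a non-negative random int; the original does not show how rand is chosen.
        int rand = RANDOM.nextInt(Integer.MAX_VALUE);
        return "BP-" + rand + "-" + ip + "-" + nowMillis;
    }

    public static void main(String[] args) {
        String id = newBlockPoolId(localIpOrPlaceholder(), System.currentTimeMillis());
        System.out.println(id);
    }
}
```

Even when the IP is the placeholder, the random part and the timestamp still vary between calls, which is the basis of the argument above.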

4. we should not have spaces in cluster id. (cid - otherstuff)
Yes, removed.

5. can we do without if (sv == null || st == null || sid == null || scid == null || sbpid
== null || sct == null) {
in Storage.java:getFields()? Also please remove the commented-out lines there.

I removed the commented-out lines there.  We need to keep the check, as it validates the
VERSION file.  The validation is legitimate, and it is also part of the original
code.
Now the code looks like this,

    if (sv == null || st == null || sid == null || scid == null || sbpid == null
        || sct == null) {
      throw new InconsistentFSStateException(sd.root,
        "file " + STORAGE_FILE_VERSION + " is invalid.");
    }
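
For readers unfamiliar with this kind of check, the sketch below shows the same idea in isolation: load a properties-style file and fail if any required field is missing. The key names are assumptions inferred from the variable names above (sv, st, sid, scid, sbpid, sct), and a plain IOException stands in for InconsistentFSStateException; this is not the Storage.java code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch; not the actual Storage.getFields() implementation.
public class VersionFileCheckSketch {
    // Assumed key names, guessed from the variable names in the snippet above.
    static final String[] REQUIRED_KEYS = {
        "layoutVersion", "storageType", "namespaceID",
        "clusterID", "blockpoolID", "cTime"
    };

    /** Throws if any required field is absent from the loaded properties. */
    static void validate(Properties props) throws IOException {
        for (String key : REQUIRED_KEYS) {
            if (props.getProperty(key) == null) {
                throw new IOException("file VERSION is invalid: missing " + key);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder values only; blockpoolID is left out on purpose.
        String contents = "layoutVersion=x\nstorageType=x\nnamespaceID=x\n"
            + "clusterID=x\ncTime=x\n";
        Properties props = new Properties();
        props.load(new StringReader(contents));
        try {
            validate(props);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Checking every field up front means a truncated or hand-edited VERSION file is rejected before any of its values are used.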

> HDFS federation: propose ClusterID and BlockPoolID format
> ---------------------------------------------------------
>
>                 Key: HDFS-1365
>                 URL: https://issues.apache.org/jira/browse/HDFS-1365
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Tanping Wang
>            Assignee: Tanping Wang
>             Fix For: Federation Branch
>
>         Attachments: HDFS1365-branch-HDFS1052.1.patch, HDFS1365-branch1052-2.patch, HDFS1365-branch1052-3.patch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

