hadoop-hdfs-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5454) DataNode UUID should be assigned prior to FsDataset initialization
Date Mon, 04 Nov 2013 19:02:17 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813117#comment-13813117 ]

Arpit Agarwal commented on HDFS-5454:

The javadoc for {{DataNode#initStorage}} does not appear to match the code.

   * Initializes the {@link #data}. The initialization is done only once, when
   * handshake with the the first namenode is completed.
  private void initStorage(final NamespaceInfo nsInfo) throws IOException {

However, {{initStorage}} is invoked before the handshake completes, in {{BPServiceActor#connectToNNAndHandshake}}:

    // Verify that this matches the other NN in this HA pair.
    // This also initializes our block pool in the DN if we are
    // the first NN connection for this BP.
    bpos.verifyAndSetNamespaceInfo(nsInfo);    <<<--- Calls initStorage.
    // Second phase of the handshake with the NN.

I am not sure if we need to reorder the calls. Would need to look at this further.
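
To make the ordering concern concrete, here is a minimal sketch of the call sequence as I read it. It does not use the real Hadoop classes: {{HandshakeOrderSketch}}, {{retrieveNamespaceInfo}}, {{register}} and the step labels are placeholders assumed for illustration, and only the relationship between {{verifyAndSetNamespaceInfo}} and {{initStorage}} is taken from the snippet above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the actor; NOT the real BPServiceActor.
class HandshakeOrderSketch {
    // Records the order in which the steps run.
    final List<String> steps = new ArrayList<>();
    private boolean storageInitialized = false;

    // Assumed first phase: obtain namespace info from the NN (stubbed here).
    String retrieveNamespaceInfo() {
        steps.add("phase1:getNamespaceInfo");
        return "BP-example"; // placeholder block pool id
    }

    // Plays the role of bpos.verifyAndSetNamespaceInfo(nsInfo):
    // storage is initialized only on the first call.
    void verifyAndSetNamespaceInfo(String bpid) {
        steps.add("verifyAndSetNamespaceInfo");
        if (!storageInitialized) {
            initStorage(bpid);
        }
    }

    void initStorage(String bpid) {
        steps.add("initStorage");
        storageInitialized = true;
    }

    // Assumed second phase of the handshake (stubbed here).
    void register() {
        steps.add("phase2:register");
    }

    void connectToNNAndHandshake() {
        String bpid = retrieveNamespaceInfo();
        verifyAndSetNamespaceInfo(bpid); // initStorage runs here...
        register();                      // ...before the second phase
    }

    public static void main(String[] args) {
        HandshakeOrderSketch actor = new HandshakeOrderSketch();
        actor.connectToNNAndHandshake();
        System.out.println(actor.steps);
    }
}
```

In this sketch {{initStorage}} always lands between the two phases, which is why the "when handshake ... is completed" wording in the javadoc looks inaccurate.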

> DataNode UUID should be assigned prior to FsDataset initialization
> ------------------------------------------------------------------
>                 Key: HDFS-5454
>                 URL: https://issues.apache.org/jira/browse/HDFS-5454
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: Heterogeneous Storage (HDFS-2832)
>            Reporter: Eric Sirianni
>            Priority: Minor
> The DataNode's UUID ({{DataStorage.getDatanodeUuid()}} field) is NULL at the point where
> the {{FsDataset}} object is created ({{DataNode.initStorage()}}).
>
> As the {{DataStorage}} object is an input to the {{FsDataset}} factory method, it is
> desirable for it to be fully populated with a UUID at this point.  In particular, our {{FsDatasetSpi}}
> implementation relies upon the DataNode UUID as a key to access our underlying block storage
>
> This also appears to be a regression compared to Hadoop 1.x - our 1.x {{FSDatasetInterface}}
> plugin has a non-NULL UUID on startup.  I haven't fully traced through the code, but I suspect
> this came from the {{BPOfferService}}/{{BPServiceActor}} refactoring to support federated
>
> With HDFS-5448, the DataNode is now responsible for generating its own UUID.  This greatly
> simplifies the fix.  Move the UUID check/generation from {{DataNode.createBPRegistration()}}
> to {{DataNode.initStorage()}}.  This more naturally co-locates UUID generation immediately
> subsequent to the read of the UUID from the {{DataStorage}} properties file.
> {code}
>   private void initStorage(final NamespaceInfo nsInfo) throws IOException {
>     // ...
>       final String bpid = nsInfo.getBlockPoolID();
>       //read storage info, lock data dirs and transition fs state if necessary
>       storage.recoverTransitionRead(this, bpid, nsInfo, dataDirs, startOpt);
>       checkDatanodeUuid();
>     // ...
>   }
> {code}
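
The fix proposed in the description above can be sketched in isolation as follows. This is only an illustration under stated assumptions: {{Storage}} and {{Dataset}} are hypothetical stand-ins for {{DataStorage}} and the {{FsDataset}} implementation, the body of {{checkDatanodeUuid}} is a guess at "generate a UUID if none was read", and persisting the UUID to disk is not modeled.

```java
import java.util.UUID;

// Hypothetical sketch; none of these types are the real Hadoop classes.
class DatanodeUuidSketch {
    // Stand-in for DataStorage: the UUID is null until read or generated.
    static class Storage {
        private String datanodeUuid;

        String getDatanodeUuid() { return datanodeUuid; }
        void setDatanodeUuid(String uuid) { datanodeUuid = uuid; }

        // Stand-in for recoverTransitionRead(...): the real method reads the
        // storage properties file; here it leaves the UUID unchanged.
        void recoverTransitionRead() { }
    }

    // Stand-in for a dataset plugin that keys its block storage on the UUID.
    static class Dataset {
        final String key;

        Dataset(Storage storage) {
            if (storage.getDatanodeUuid() == null) {
                throw new IllegalStateException("DataNode UUID not assigned");
            }
            key = storage.getDatanodeUuid();
        }
    }

    // Assumed behavior: assign a fresh UUID only when none is present.
    static void checkDatanodeUuid(Storage storage) {
        if (storage.getDatanodeUuid() == null) {
            storage.setDatanodeUuid(UUID.randomUUID().toString());
        }
    }

    // The UUID check runs right after the storage read and before the
    // dataset is created, so the dataset never sees a null UUID.
    static Dataset initStorage(Storage storage) {
        storage.recoverTransitionRead();
        checkDatanodeUuid(storage);
        return new Dataset(storage);
    }

    public static void main(String[] args) {
        Dataset dataset = initStorage(new Storage());
        System.out.println("dataset key = " + dataset.key);
    }
}
```

Without the {{checkDatanodeUuid}} call inside {{initStorage}}, the {{Dataset}} constructor in this sketch would throw on a freshly formatted storage directory, which corresponds to the null-UUID situation the reporter describes.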

This message was sent by Atlassian JIRA
