hadoop-hdfs-issues mailing list archives

From "Eric Sirianni (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5484) StorageType and State in DatanodeStorageInfo in NameNode is not accurate
Date Fri, 13 Dec 2013 15:53:07 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847588#comment-13847588 ]

Eric Sirianni commented on HDFS-5484:
-------------------------------------

This fix was effectively nullified by the following change made via HDFS-5542:

{code}
+  DatanodeStorageInfo updateStorage(DatanodeStorage s) {
     synchronized (storageMap) {
       DatanodeStorageInfo storage = storageMap.get(s.getStorageID());
       if (storage == null) {
@@ -670,8 +658,6 @@
                  " for DN " + getXferAddr());
         storage = new DatanodeStorageInfo(this, s);
         storageMap.put(s.getStorageID(), storage);
-      } else {
-        storage.setState(s.getState());
       }
       return storage;
     }
{code}

Is there a reason the 'else' branch was removed?  Because the state is no longer updated in the
{{BlockReport}} processing path, the NameNode keeps the bogus state & type that is set via the
first heartbeat (see the fix for HDFS-5455):

{code}
+      if (storage == null) {
+        // This is seen during cluster initialization when the heartbeat
+        // is received before the initial block reports from each storage.
+        storage = updateStorage(new DatanodeStorage(report.getStorageID()));
{code}
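
To make the ordering problem concrete, here is a minimal, self-contained sketch of the sequence described above. Every class in it is a simplified, hypothetical stand-in named after the HDFS class it imitates, not the real Hadoop code, and the {{refreshExisting}} flag is an assumption added only to switch the removed 'else' branch on and off:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical stand-ins for the HDFS classes; not the real Hadoop code.
class StorageStateDemo {
  enum State { NORMAL, READ_ONLY }
  enum StorageType { DISK, SSD }

  static final class DatanodeStorage {
    final String storageID;
    final State state;
    final StorageType type;

    // ID-only constructor: state and type fall back to the defaults.
    DatanodeStorage(String storageID) {
      this(storageID, State.NORMAL, StorageType.DISK);
    }

    DatanodeStorage(String storageID, State state, StorageType type) {
      this.storageID = storageID;
      this.state = state;
      this.type = type;
    }
  }

  static final class DatanodeStorageInfo {
    State state;
    final StorageType type;

    DatanodeStorageInfo(DatanodeStorage s) {
      this.state = s.state;
      this.type = s.type;
    }
  }

  static final class DatanodeDescriptor {
    private final Map<String, DatanodeStorageInfo> storageMap = new HashMap<>();
    // Assumption for this sketch: true keeps the removed 'else' branch.
    private final boolean refreshExisting;

    DatanodeDescriptor(boolean refreshExisting) {
      this.refreshExisting = refreshExisting;
    }

    DatanodeStorageInfo updateStorage(DatanodeStorage s) {
      synchronized (storageMap) {
        DatanodeStorageInfo storage = storageMap.get(s.storageID);
        if (storage == null) {
          storage = new DatanodeStorageInfo(s);
          storageMap.put(s.storageID, storage);
        } else if (refreshExisting) {
          // Mirrors the removed setState(...) call, which refreshed only the state.
          storage.state = s.state;
        }
        return storage;
      }
    }
  }

  // Replays "heartbeat first, block report second" and returns the recorded state.
  static State stateAfterHeartbeatThenBlockReport(boolean refreshExisting) {
    DatanodeDescriptor dn = new DatanodeDescriptor(refreshExisting);
    // Heartbeat arrives first and registers the storage by ID only.
    dn.updateStorage(new DatanodeStorage("DS-1"));
    // The block report then carries the real state and type.
    return dn.updateStorage(
        new DatanodeStorage("DS-1", State.READ_ONLY, StorageType.SSD)).state;
  }

  public static void main(String[] args) {
    System.out.println("without else: " + stateAfterHeartbeatThenBlockReport(false));
    System.out.println("with else:    " + stateAfterHeartbeatThenBlockReport(true));
  }
}
```

Under these assumptions, the storage stays at the default state when the branch is missing, and picks up the reported state when it is present.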

Even reverting the change and reintroducing the 'else' leaves the state & type temporarily
inaccurate until the first block report.  Wouldn't a better fix be to simply include the full
{{DatanodeStorage}} object in the {{StorageReport}} (as opposed to only the Storage ID)?
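
A rough sketch of what that could look like, purely as an illustration: the {{StorageReport}} shape, the {{handleHeartbeat}} method, and the {{capacity}} field below are hypothetical and do not reflect the actual Hadoop classes or wire format.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposal; none of these types are the real Hadoop classes.
class StorageReportSketch {
  enum State { NORMAL, READ_ONLY }
  enum StorageType { DISK, SSD }

  static final class DatanodeStorage {
    final String storageID;
    final State state;
    final StorageType type;

    DatanodeStorage(String storageID, State state, StorageType type) {
      this.storageID = storageID;
      this.state = state;
      this.type = type;
    }
  }

  // Assumed report shape: carries the whole DatanodeStorage, not just its ID.
  static final class StorageReport {
    final DatanodeStorage storage;
    final long capacity;

    StorageReport(DatanodeStorage storage, long capacity) {
      this.storage = storage;
      this.capacity = capacity;
    }
  }

  static final class DatanodeStorageInfo {
    State state;
    StorageType type;
    long capacity;
  }

  static final class DatanodeDescriptor {
    private final Map<String, DatanodeStorageInfo> storageMap = new HashMap<>();

    // Every heartbeat refreshes state and type, so no dummy defaults are recorded,
    // even when the heartbeat arrives before the first block report.
    DatanodeStorageInfo handleHeartbeat(StorageReport report) {
      synchronized (storageMap) {
        DatanodeStorageInfo info = storageMap.computeIfAbsent(
            report.storage.storageID, id -> new DatanodeStorageInfo());
        info.state = report.storage.state;
        info.type = report.storage.type;
        info.capacity = report.capacity;
        return info;
      }
    }
  }

  public static void main(String[] args) {
    DatanodeDescriptor dn = new DatanodeDescriptor();
    DatanodeStorageInfo info = dn.handleHeartbeat(new StorageReport(
        new DatanodeStorage("DS-1", State.READ_ONLY, StorageType.SSD), 1024L));
    System.out.println(info.state + " / " + info.type);
  }
}
```

With the full object in every heartbeat, the NameNode would not need to wait for a block report to learn the state and type.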

As a matter of bookkeeping, should I reopen this JIRA, or would you prefer a new one be created?

> StorageType and State in DatanodeStorageInfo in NameNode is not accurate
> ------------------------------------------------------------------------
>
>                 Key: HDFS-5484
>                 URL: https://issues.apache.org/jira/browse/HDFS-5484
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>    Affects Versions: Heterogeneous Storage (HDFS-2832)
>            Reporter: Eric Sirianni
>             Fix For: Heterogeneous Storage (HDFS-2832)
>
>         Attachments: HDFS-5484-HDFS-2832--2.patch, HDFS-5484-HDFS-2832.patch
>
>
> The fields in DatanodeStorageInfo are updated from two distinct paths:
> # block reports
> # storage reports (via heartbeats)
> The {{state}} and {{storageType}} fields are updated via the Block Report.  However,
> as seen in the code below, these fields are populated from a "dummy" {{DatanodeStorage}} object
> constructed in the DataNode:
> {code}
> BPServiceActor.blockReport() {
> //...
>         // Dummy DatanodeStorage object just for sending the block report.
>         DatanodeStorage dnStorage = new DatanodeStorage(storageID);
> //...
> }
> {code}
> The net effect is that the {{state}} and {{storageType}} fields are always the default
> of {{NORMAL}} and {{DISK}} in the NameNode.
> The recommended fix is to change {{FsDatasetSpi.getBlockReports()}} from:
> {code}
> public Map<String, BlockListAsLongs> getBlockReports(String bpid);
> {code}
> to:
> {code}
> public Map<DatanodeStorage, BlockListAsLongs> getBlockReports(String bpid);
> {code}
> thereby allowing {{BPServiceActor}} to send the "real" {{DatanodeStorage}} object with
> the block report.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
