hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7647) DatanodeManager.sortLocatedBlocks() sorts DatanodeInfos but not StorageIDs
Date Fri, 23 Jan 2015 18:58:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289725#comment-14289725
] 

Arpit Agarwal commented on HDFS-7647:
-------------------------------------

Thanks for the patch [~milandesai], this looks like a good change.

Couple of comments:
# {{LocatedBlocks.getStorageTypes}} and {{.getStorageIDs}} should cache the generated arrays
on first invocation since existing callers expect these calls to be cheap. Except for the
sorting code the content of {{locs}} is not modified once the object is initialized.
# The sorting code must invalidate the cached arrays from 1.
# We should add a unit test for sortLocatedBlocks specifically for the invalidation.
# Also it would be good to add a comment to {{LocatedBlocks}} stating the assumption that
{{locs}} must not be modified by the caller, with the exception of {{sortLocatedBlocks}}.

In a separate Jira it would be good to make {{locs}} an unmodifiable list or a Guava {{ImmutableList}}.
The source of the issue is that an external function reaches into the LocatedBlock object
and modifies its private fields. It doesn't help that Java lacks support for C++-style const
arrays.

> DatanodeManager.sortLocatedBlocks() sorts DatanodeInfos but not StorageIDs
> --------------------------------------------------------------------------
>
>                 Key: HDFS-7647
>                 URL: https://issues.apache.org/jira/browse/HDFS-7647
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Milan Desai
>            Assignee: Milan Desai
>         Attachments: HDFS-7647-2.patch, HDFS-7647.patch
>
>
> DatanodeManager.sortLocatedBlocks() sorts the array of DatanodeInfos inside each LocatedBlock,
but does not touch the array of StorageIDs and StorageTypes. As a result, the DatanodeInfos
and StorageIDs/StorageTypes are mismatched. The method is called by FSNamesystem.getBlockLocations(),
so the client will not know which StorageID/Type corresponds to which DatanodeInfo.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message