hadoop-common-dev mailing list archives

From "Ahad Rana (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4483) getBlockArray in DatanodeDescriptor does not honor passed in maxblocks value
Date Wed, 22 Oct 2008 00:14:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641674#action_12641674 ]

Ahad Rana commented on HADOOP-4483:

Re: That is wrong. clear() removes all elements. You only have to set root = null,

You are right. I see that TreeSet uses TreeMap as its underlying container, and each iterator.remove
causes a re-balance of the red-black tree. And yes, it looks like TreeSet.clear() just sets root =
null, which is obviously faster. I would still argue that under normal load (from my initial
observations) the number of blocks in the collection is <= 10, and since delete in the CLRS
red-black tree algorithm takes O(log n), there may not be much to gain from the clear()
optimization. In the edge case where the system is under heavy load, I have observed block counts
in excess of 1000. In that case, a partial removal of the blocks (up to the max-blocks limit)
would still require the iterator.remove pattern. So perhaps in the long term it would be better
to replace the underlying data structure, as Hairong suggests. It would be interesting to find
out whether the code (post-patch) is actually a performance bottleneck before undertaking the
more aggressive modification.
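To illustrate the partial-removal pattern discussed above, here is a minimal sketch (not the actual DatanodeDescriptor code; the class and method names are hypothetical) that copies at most maxblocks entries out of a TreeSet and removes only those via iterator.remove, each removal re-balancing the underlying red-black tree:

```java
import java.util.Iterator;
import java.util.TreeSet;

public class GetBlockArraySketch {
    // Hypothetical sketch: return at most maxblocks elements and remove
    // exactly those from the set, leaving the remainder for a later pass.
    static Integer[] getBlockArray(TreeSet<Integer> blocks, int maxblocks) {
        int n = Math.min(maxblocks, blocks.size());
        Integer[] out = new Integer[n];
        Iterator<Integer> it = blocks.iterator();
        for (int i = 0; i < n; i++) {
            out[i] = it.next();
            it.remove(); // each remove re-balances the red-black tree: O(log n)
        }
        return out;
    }

    public static void main(String[] args) {
        TreeSet<Integer> blocks = new TreeSet<>();
        for (int i = 0; i < 1000; i++) blocks.add(i);
        Integer[] out = getBlockArray(blocks, 10);
        // out.length == 10, blocks.size() == 990
        System.out.println(out.length + " taken, " + blocks.size() + " remaining");
    }
}
```

When the set is fully drained, a single clear() (which nulls the root) would of course be cheaper than removing elements one at a time.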

> getBlockArray in DatanodeDescriptor does not honor passed in maxblocks value
> ----------------------------------------------------------------------------
>                 Key: HADOOP-4483
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4483
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.1
>         Environment: hadoop-0.18.1 running on a cluster of 16 nodes.
>            Reporter: Ahad Rana
>            Priority: Critical
>             Fix For: 0.18.2
>         Attachments: patch.HADOOP-4483
>   Original Estimate: 1h
>  Remaining Estimate: 1h
> The getBlockArray method in DatanodeDescriptor.java should honor the passed-in maxblocks
parameter. In its current form it passes an array sized to min(maxblocks, blocks.size())
into Collection.toArray. As the javadoc for Collection.toArray indicates, the toArray method
may discard the passed-in array (and allocate a new one) if the number of elements returned
by the iterator exceeds the size of the passed-in array. As a result, the flawed implementation
of this method returns all the invalid blocks for a DataNode in one go, triggering the NameNode
to send the DataNode a DNA_INVALIDATE command with an excessively large number of blocks. This
INVALIDATE command, in turn, can take a very long time to process at the DataNode, and since
DatanodeCommand(s) are processed between heartbeats at the DataNode, this can lead the NameNode
to consider the DataNode offline / unresponsive (due to the lack of heartbeats).
> In our use case at CommonCrawl.org, we regularly do large-scale HDFS file deletions after
certain stages of our map-reduce pipeline. These deletes would make certain DataNode(s)
unresponsive, and thus impair the cluster's ability to properly balance file-system reads /
writes across the whole available cluster. This problem only surfaced once we migrated from
our 0.16.2 deployment to the current 0.18.1 release.
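The toArray pitfall described above can be demonstrated in isolation. This is a minimal sketch (the block counts are illustrative, not taken from the cluster): sizing the destination array does not cap the result, because Collection.toArray(T[]) allocates a new array whenever the collection does not fit in the one provided.

```java
import java.util.TreeSet;

public class ToArrayPitfall {
    public static void main(String[] args) {
        TreeSet<Integer> blocks = new TreeSet<>();
        for (int i = 0; i < 1000; i++) blocks.add(i); // heavy-load scenario
        int maxblocks = 10;

        // The flawed pattern: the array size is only a hint, not a limit.
        // Since 1000 elements do not fit in a 10-slot array, toArray
        // discards it and allocates a fresh array of size 1000.
        Integer[] out =
            blocks.toArray(new Integer[Math.min(maxblocks, blocks.size())]);

        System.out.println(out.length); // 1000, not 10
    }
}
```

Honoring maxblocks therefore requires copying elements out explicitly (e.g. with an iterator) rather than relying on the size of the array handed to toArray.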

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
