hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "jiraposter@reviews.apache.org (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2476) More CPU efficient data structure for under-replicated/over-replicated/invalidate blocks
Date Fri, 21 Oct 2011 06:30:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132364#comment-13132364
] 

jiraposter@reviews.apache.org commented on HDFS-2476:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2515/#review2738
-----------------------------------------------------------

Ship it!


Looks very good. See the small nits below. Are there any non-trivial changes in this patch
compared to what you're running in production?


trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java
<https://reviews.apache.org/r/2515/#comment6169>

    This javadoc describes a linked hashset, but the implementation is just a normal hashset.
Copy-paste error?



trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java
<https://reviews.apache.org/r/2515/#comment6167>

    can hashCode and element be marked final?



trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java
<https://reviews.apache.org/r/2515/#comment6168>

    formatting is a little off - should go on same line



trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java
<https://reviews.apache.org/r/2515/#comment6170>

    I find this javadoc a little misleading - it implies that there's a linked list of elements
separate from the hashtable chains. The removal order is in hash-bucket order rather than
insertion order. "Maybe something like: removes N entries from the hashtable. The order in
which entries are removed is unspecified, and may not correspond to the order in which they
were inserted."


- Todd


On 2011-10-20 22:58:45, Tomasz Nykiel wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2515/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-20 22:58:45)
bq.  
bq.  
bq.  Review request for Hairong Kuang.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  This patch introduces two hash data structures for storing under-replicated, over-replicated
and invalidated blocks.
bq.  
bq.  1. LightWeightHashSet
bq.  2. LightWeightLinkedSet
bq.  
bq.  Currently in all these cases we are using java.util.TreeSet which adds unnecessary overhead.
bq.  
bq.  The main bottlenecks addressed by this patch are:
bq.  -cluster instability times, when these queues (especially under-replicated) tend to grow
quite drastically,
bq.  -initial cluster startup, when the queues are initialized, after leaving safemode,
bq.  -block reports,
bq.  -explicit acks for block addition and deletion
bq.  
bq.  1. The introduced structures are CPU-optimized.
bq.  2. They shrink and expand according to current capacity.
bq.  3. Add/contains/delete ops are performed in O(1) time (unlike current log n for TreeSet).
bq.  4. The sets are equipped with fast access methods for polling a number of elements (get+remove),
which are used for handling the queues.
bq.  
bq.  
bq.  This addresses bug HDFS-2476.
bq.      https://issues.apache.org/jira/browse/HDFS-2476
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
1187124 
bq.    trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
1187124 
bq.    trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/InvalidateBlocks.java
1187124 
bq.    trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/UnderReplicatedBlocks.java
1187124 
bq.    trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
1187124 
bq.    trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
1187124 
bq.    trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
1187124 
bq.    trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSck.java
1187124 
bq.    trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightHashSet.java
PRE-CREATION 
bq.    trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/util/LightWeightLinkedSet.java
PRE-CREATION 
bq.    trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestListCorruptFileBlocks.java
1187124 
bq.    trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLightWeightHashSet.java
PRE-CREATION 
bq.    trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/util/TestLightWeightLinkedSet.java
PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2515/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Provided JUnit tests.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Tomasz
bq.  
bq.


                
> More CPU efficient data structure for under-replicated/over-replicated/invalidate blocks
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-2476
>                 URL: https://issues.apache.org/jira/browse/HDFS-2476
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>            Reporter: Tomasz Nykiel
>            Assignee: Tomasz Nykiel
>         Attachments: hashStructures.patch, hashStructures.patch-2
>
>
> This patch introduces two hash data structures for storing under-replicated, over-replicated
and invalidated blocks. 
> 1. LightWeightHashSet
> 2. LightWeightLinkedSet
> Currently in all these cases we are using java.util.TreeSet which adds unnecessary overhead.
> The main bottlenecks addressed by this patch are:
> -cluster instability times, when these queues (especially under-replicated) tend to grow
quite drastically,
> -initial cluster startup, when the queues are initialized, after leaving safemode,
> -block reports,
> -explicit acks for block addition and deletion
> 1. The introduced structures are CPU-optimized.
> 2. They shrink and expand according to current capacity.
> 3. Add/contains/delete ops are performed in O(1) time (unlike current log n for TreeSet).
> 4. The sets are equipped with fast access methods for polling a number of elements (get+remove),
which are used for handling the queues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message