hadoop-hdfs-issues mailing list archives

From "Yi Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7174) Support for more efficient large directories
Date Fri, 03 Oct 2014 05:06:34 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157676#comment-14157676 ]

Yi Liu commented on HDFS-7174:

[~kihwal], I like the idea. It improves add/remove performance for large directories while
barely increasing memory usage (there is a small increase when switching to {{INodeHashedArrayList}},
explained below), and search performance is not degraded.
* A 1024-entry list table needs 1024 * 4 = 4K of memory to store references to the ArrayLists
(assuming a 4-byte reference). Originally we had only 1 ArrayList, but now there are 1024,
and in Java every object carries header overhead (suppose it's 12 bytes), so the total added
object overhead is ~1024 * 12 = 12K. In total, about 16K of extra memory for this FSDirectory.
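The arithmetic above can be sketched as a small calculation (the 4-byte reference and 12-byte object-header sizes are assumptions from this discussion, not measured values):

```java
// Rough estimate of the extra memory for a 1024-bucket hashed list table.
// The per-reference and per-object-header sizes are assumptions, not
// measurements; real sizes depend on the JVM and compressed-oops settings.
public class HashedListOverheadEstimate {
    static long extraBytes(int buckets, int refBytes, int objectHeaderBytes) {
        long referenceTable = (long) buckets * refBytes;          // table of ArrayList references
        long headerOverhead = (long) buckets * objectHeaderBytes; // one object header per ArrayList
        return referenceTable + headerOverhead;
    }

    public static void main(String[] args) {
        // 1024 * 4 + 1024 * 12 = 4K + 12K = 16K
        System.out.println(extraBytes(1024, 4, 12) + " bytes");
    }
}
```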

Currently the children of INodeDirectory are stored as a sorted array list. The time complexity is:
Add: O(log n) + list resize and array copy.
Delete: O(log n) + array copy.
Search: O(log n)
As the list gets bigger, the _list resize (new memory allocation) and array copy_ becomes the performance bottleneck.
Your idea is a good way to solve this issue.
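For reference, the current scheme behaves roughly like this simplified stand-in (an illustration, not the actual INodeDirectory code): binary search finds the slot in O(log n), but the insert itself still shifts elements and may trigger a resize.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified stand-in for a sorted child list. Lookup is O(log n), but
// insertion shifts up to n elements, and growing the backing array
// reallocates and copies the whole thing.
public class SortedChildList {
    private final List<String> children = new ArrayList<>();

    public boolean add(String name) {
        int i = Collections.binarySearch(children, name); // O(log n)
        if (i >= 0) {
            return false;                                 // duplicate name
        }
        children.add(-i - 1, name);                       // O(n) shift in the worst case
        return true;
    }

    public boolean contains(String name) {
        return Collections.binarySearch(children, name) >= 0; // O(log n)
    }

    public int size() {
        return children.size();
    }
}
```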

The biggest performance hit is due to the way insertion is done in ArrayList. An insertion
requires copying an average of n/2 entries. Also, as the list gets bigger, a new backing array
with a bigger size needs to be allocated, copied, and then partially copied again on insert.
For big directories, we are talking megabytes. Far better performance can be obtained by using
different data structures, but I believe ArrayList (originally a primitive array) was chosen
to minimize memory usage.

I wonder, is it worthwhile to try to store the inode ids as a long[] directly?
As a reference, for C++ on modern machines, inserting elements into std::vector is consistently
faster than into std::list.
ArrayList is also memory-efficient: it stores an array of references and has only a single
object's overhead (maybe 12 bytes?).
std::vector is similar to ArrayList in Java (ArrayList and Vector in Java seem to differ only
in synchronization and growth policy).
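A hypothetical sketch of the long[] idea ({{LongIdArray}} is an illustration, not an actual HDFS class): lookup stays O(log n) via binary search, insertion still costs an array copy, but there are no per-element object references at all.

```java
import java.util.Arrays;

// Hypothetical sketch: a sorted primitive long[] of inode ids. No
// per-element reference or object-header overhead; insertion still
// allocates and copies, as with ArrayList.
public class LongIdArray {
    private long[] ids = new long[0];

    public boolean add(long id) {
        int i = Arrays.binarySearch(ids, id);
        if (i >= 0) {
            return false;                 // already present
        }
        int pos = -i - 1;                 // insertion point
        long[] bigger = new long[ids.length + 1];
        System.arraycopy(ids, 0, bigger, 0, pos);
        bigger[pos] = id;
        System.arraycopy(ids, pos, bigger, pos + 1, ids.length - pos);
        ids = bigger;
        return true;
    }

    public boolean contains(long id) {
        return Arrays.binarySearch(ids, id) >= 0;
    }

    public int size() {
        return ids.length;
    }
}
```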

I have a few comments:
*1.*  I see you chose the threshold 50000, and from the performance grid, {{ArrayList}} and
{{Auto Switching}} are close at this point.  I suggest choosing a somewhat larger value, since
we need to consider the memory usage, as explained above.
*2.* In the following code:
  public void add(int index, INode node) {
    getList().add(index, node);
  }
We can check the size first and then do the add; that would be more efficient. The same applies to delete.
*3.* We also need to consider the behavior when the size reaches the threshold. Adds and removes
may alternate around it and cause the list to switch representations frequently. Maybe we can
define separate thresholds for add and remove, with the remove threshold a bit smaller.
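Comment *2.* could be sketched like this (the constructor-supplied threshold and the {{switchToHashedList()}} name are assumptions for illustration, not the actual patch code): test the threshold *before* inserting, so the element lands directly in the new representation instead of being copied again right after the add.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: check the size before adding, so a switch happens
// first and the element is inserted only once into the new structure.
public class AutoSwitchingList {
    private final int threshold;
    private final List<String> list = new ArrayList<>();
    private boolean hashed = false;

    public AutoSwitchingList(int threshold) {
        this.threshold = threshold;
    }

    public void add(int index, String node) {
        if (!hashed && list.size() + 1 > threshold) {
            switchToHashedList();   // switch first, then add once
        }
        list.add(index, node);
    }

    private void switchToHashedList() {
        // Placeholder: the real patch would rebuild the children into the
        // hashed-array-list structure here.
        hashed = true;
    }

    public boolean isHashed() {
        return hashed;
    }

    public int size() {
        return list.size();
    }
}
```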
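And the hysteresis idea in comment *3.* might look like this (the threshold values are illustrative assumptions): with a lower switch-down threshold, alternating add/remove near the switch-up threshold no longer flips the representation back and forth.

```java
// Hypothetical sketch of separate switch-up and switch-down thresholds
// (hysteresis). The values are illustrative, not from the patch.
public class HysteresisSwitch {
    static final int SWITCH_UP_THRESHOLD = 50000;   // ArrayList -> hashed list
    static final int SWITCH_DOWN_THRESHOLD = 40000; // hashed list -> ArrayList

    static boolean shouldSwitchUp(boolean hashed, int size) {
        return !hashed && size > SWITCH_UP_THRESHOLD;
    }

    static boolean shouldSwitchDown(boolean hashed, int size) {
        return hashed && size < SWITCH_DOWN_THRESHOLD;
    }
}
```

A size oscillating between, say, 49,999 and 50,001 would cross only the upper threshold once and then stay in the hashed representation.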

> Support for more efficient large directories
> --------------------------------------------
>                 Key: HDFS-7174
>                 URL: https://issues.apache.org/jira/browse/HDFS-7174
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>         Attachments: HDFS-7174.new.patch, HDFS-7174.patch, HDFS-7174.patch
> When the number of children under a directory grows very large, insertion becomes very
costly. E.g. creating 1M entries takes tens of minutes. This is because the complexity of
an insertion is O\(n\), so as the size of the list grows, the total cost grows as O(n^2)
(the integral of a linear function). It also causes allocations and copies of big arrays.

This message was sent by Atlassian JIRA
