hadoop-hdfs-issues mailing list archives

From "Lei (Eddy) Xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6673) Add Delimited format supports for PB OIV tool
Date Wed, 21 Jan 2015 00:13:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284696#comment-14284696

Lei (Eddy) Xu commented on HDFS-6673:

[~wheat9] To provide more background, here is what I tried:

1. I tried using {{directory ID || inode Id}} as the key and the {{INode}} protobuf as the value to
store all INodes in LevelDB. The end-to-end time was about 40-50 minutes, and dumping the INodes
alone took about 20 minutes, which is already longer than the current end-to-end time
(10 minutes). Moreover, once the LevelDB grew larger (about 1GB, as I recall),
write performance dropped significantly. I suspect the cause is [write-amplification|https://github.com/facebook/rocksdb/wiki/RocksDB-Basics].
I also tried splitting one large LevelDB into multiple smaller ones, but it was not worth
the complexity. As a result, I dropped this approach and chose not to re-order inodes.
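
For illustration, the {{directory ID || inode Id}} key from item 1 could be encoded as below. This is only a sketch: the class and method names are hypothetical, and the 16-byte big-endian layout is an assumption rather than the encoding used in the patch. Big-endian encoding is chosen because, for non-negative IDs, a bytewise key comparison then sorts entries by directory first and by inode second.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not taken from the HDFS-6673 patch.
class InodeKeySketch {
    /** Concatenates the parent directory ID and the inode ID into a 16-byte key. */
    static byte[] makeKey(long dirId, long inodeId) {
        // ByteBuffer writes big-endian by default.
        return ByteBuffer.allocate(2 * Long.BYTES)
                .putLong(dirId)
                .putLong(inodeId)
                .array();
    }

    /** Reads the directory ID back from the first 8 bytes of the key. */
    static long dirIdOf(byte[] key) {
        return ByteBuffer.wrap(key).getLong(0);
    }

    /** Reads the inode ID back from the last 8 bytes of the key. */
    static long inodeIdOf(byte[] key) {
        return ByteBuffer.wrap(key).getLong(Long.BYTES);
    }

    public static void main(String[] args) {
        // The IDs are arbitrary illustrative values.
        byte[] key = makeKey(16385L, 16390L);
        System.out.println(key.length + " bytes, dir=" + dirIdOf(key)
                + ", inode=" + inodeIdOf(key));
    }
}
```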

bq. This does not hold. FSImage stores the inodes with no order. See {{FSImageFormatPBINode#serializeINodeSection.}}

Yes, you are right.  But judging from {{INode#hashCode()}}, the order does not seem to be completely
random when {{INode <= 2 ** 32}}. Regardless, since {{dirChildMap}} uses {{Long}}
as both keys and values, its size is 2 orders of magnitude smaller than the
fsimage.  So if the fsimage is {{50GB}}, the LevelDB is less than 1GB and should fit reasonably
well into the OS cache on a laptop.  Thus one seek per INode may not be terribly bad.

3. The {{DirPathCache}} caches the *full path* of the parent directory and holds 16K entries. Assuming
the average full path of a directory is about 128 bytes, it uses only about 2MB of memory
(16K x 128 bytes). I expect that we can increase the capacity of this LRU cache later, once we
actually measure the hit rates. I believe that this LRU cache works, given that the measured
performance of this approach is faster.
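
A bounded LRU cache of directory paths, as described in item 3, can be sketched with {{java.util.LinkedHashMap}} in access order. The class name below is hypothetical and this is not the {{DirPathCache}} implementation from the patch. The demo uses a capacity of 2 to make the eviction visible. With 16K entries of about 128 bytes each, the path data alone comes to roughly 2 MiB (object overhead excluded), the same order of magnitude as the estimate above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache: directory inode ID -> full path of that directory.
class DirPathLruSketch extends LinkedHashMap<Long, String> {
    private final int capacity;

    DirPathLruSketch(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently used.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Long, String> eldest) {
        // Evict the least recently used entry once the capacity is exceeded.
        return size() > capacity;
    }

    public static void main(String[] args) {
        DirPathLruSketch cache = new DirPathLruSketch(2);
        cache.put(1L, "/");
        cache.put(2L, "/user");
        cache.get(1L);               // touch entry 1, so entry 2 becomes the eldest
        cache.put(3L, "/user/eddy"); // exceeds the capacity and evicts entry 2
        System.out.println(cache.keySet());

        // Back-of-the-envelope payload size for 16K entries of about 128 bytes each.
        long approxBytes = 16L * 1024 * 128;
        System.out.println(approxBytes + " bytes");
    }
}
```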

4. Unlike in {{FileDistributionCalculator}}, we need the full path of an inode when printing
it.  Since directories and inodes are stored out of order in the fsimage, we need to sort at least
the directories or the inodes to some extent. I chose to sort directories, because:

# The total number of directories is much smaller.
# The LRU cache is more (or only) effective for directories.
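
To make the reasoning in item 4 concrete, here is a minimal sketch of resolving an inode's full path by walking up a child-to-parent mapping and caching directory paths along the way. Everything in it is an assumption for illustration: the class and field names are hypothetical, plain in-memory maps stand in for the LevelDB-backed {{dirChildMap}}, an unbounded map stands in for the LRU cache, and the inode IDs are arbitrary.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical path resolver, not taken from the HDFS-6673 patch.
class PathResolveSketch {
    final Map<Long, Long> parentOf = new HashMap<>();    // inode ID -> parent directory ID
    final Map<Long, String> dirName = new HashMap<>();   // directory ID -> local name
    final Map<Long, String> pathCache = new HashMap<>(); // directory ID -> full path
    final long rootId;

    PathResolveSketch(long rootId) {
        this.rootId = rootId;
    }

    /** Returns the full path of a directory, filling the cache while walking up. */
    String dirPath(long dirId) {
        if (dirId == rootId) {
            return "/";
        }
        String cached = pathCache.get(dirId);
        if (cached != null) {
            return cached;
        }
        String parent = dirPath(parentOf.get(dirId));
        String full = (parent.equals("/") ? "" : parent) + "/" + dirName.get(dirId);
        pathCache.put(dirId, full);
        return full;
    }

    /** Returns the full path of any inode, given its ID and local name. */
    String inodePath(long inodeId, String name) {
        String parent = dirPath(parentOf.get(inodeId));
        return (parent.equals("/") ? "/" : parent + "/") + name;
    }

    public static void main(String[] args) {
        PathResolveSketch r = new PathResolveSketch(1L);
        r.dirName.put(2L, "user");
        r.parentOf.put(2L, 1L);
        r.dirName.put(3L, "eddy");
        r.parentOf.put(3L, 2L);
        r.parentOf.put(4L, 3L); // a file located in /user/eddy
        System.out.println(r.inodePath(4L, "a.txt"));
    }
}
```

Only directory paths are cached here, which mirrors the second point in the list: every file in a directory reuses the same cached parent path, whereas a cached file path would be used once.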

Do these make sense to you, [~wheat9]? It would be great if I could get a +1 from you.


> Add Delimited format supports for PB OIV tool
> ---------------------------------------------
>                 Key: HDFS-6673
>                 URL: https://issues.apache.org/jira/browse/HDFS-6673
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 2.4.0
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>            Priority: Minor
>         Attachments: HDFS-6673.000.patch, HDFS-6673.001.patch, HDFS-6673.002.patch, HDFS-6673.003.patch, HDFS-6673.004.patch, HDFS-6673.005.patch
> The new oiv tool, which is designed for the Protobuf fsimage, lacks a few features supported in the old {{oiv}} tool.
> This task adds support for the _Delimited_ processor to the oiv tool.

