hadoop-common-issues mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-14086) Improve DistCp Speed for small files
Date Wed, 15 Feb 2017 21:55:41 GMT
Zheng Shao created HADOOP-14086:

             Summary: Improve DistCp Speed for small files
                 Key: HADOOP-14086
                 URL: https://issues.apache.org/jira/browse/HADOOP-14086
             Project: Hadoop Common
          Issue Type: Improvement
          Components: tools/distcp
    Affects Versions: 2.6.5
            Reporter: Zheng Shao
            Assignee: Zheng Shao
            Priority: Minor

When distcp copies a large number of small files, the NameNode naturally becomes the
bottleneck, because the work per file is dominated by metadata operations rather than
data transfer.

The current distcp code is *not* optimized to minimize NameNode calls.  We should restructure
the code to reduce the number of NameNode calls as much as possible to speed up the copy of
small files.
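
To illustrate why the call count matters, here is a minimal toy model in Python. It is not DistCp code and does not use the Hadoop API: `FakeNameNode`, `get_file_status`, `list_status`, and the path layout are all hypothetical names invented for this sketch. It compares one metadata call per file against one listing call per parent directory, assuming a single listing call returns the whole directory (a real listing may be paged).

```python
from collections import defaultdict
from posixpath import dirname


class FakeNameNode:
    """Hypothetical stand-in for a NameNode; it only counts metadata calls."""

    def __init__(self, paths):
        self.calls = 0
        self._paths = set(paths)
        self._by_dir = defaultdict(list)
        for p in paths:
            self._by_dir[dirname(p)].append(p)

    def get_file_status(self, path):
        # One round-trip per file.
        self.calls += 1
        return path in self._paths

    def list_status(self, directory):
        # One round-trip per directory (assumption: no paging).
        self.calls += 1
        return list(self._by_dir[directory])


def per_file_lookups(nn, paths):
    """Check each file individually: one call per file."""
    return [p for p in paths if nn.get_file_status(p)]


def per_directory_listings(nn, paths):
    """List each parent directory once and filter locally."""
    wanted = set(paths)
    found = []
    for d in sorted({dirname(p) for p in paths}):
        found.extend(p for p in nn.list_status(d) if p in wanted)
    return found


# 10 directories with 1000 small files each (made-up layout).
paths = [f"/data/part-{d}/file-{i}" for d in range(10) for i in range(1000)]

nn_a = FakeNameNode(paths)
per_file_lookups(nn_a, paths)

nn_b = FakeNameNode(paths)
per_directory_listings(nn_b, paths)

print(nn_a.calls, nn_b.calls)  # 10000 10
```

In this model both strategies find the same files, but the per-directory variant issues three orders of magnitude fewer calls. Which NameNode calls distcp can actually batch or drop is exactly what this issue proposes to investigate.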
