hadoop-common-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-9981) listing in RawLocalFileSystem is inefficient
Date Thu, 19 Sep 2013 21:39:53 GMT
Kihwal Lee created HADOOP-9981:

             Summary: listing in RawLocalFileSystem is inefficient
                 Key: HADOOP-9981
                 URL: https://issues.apache.org/jira/browse/HADOOP-9981
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 2.3.0
            Reporter: Kihwal Lee
            Priority: Critical

After HADOOP-9652, listStatus() and globStatus() calls against a local file system directory
are very slow. A user was loading data from the local file system into Hive, and it took about 30
seconds. The same operation took less than a second before HADOOP-9652.

The input path contained many other files besides the input files, and strace showed a fork and
exec of stat for each and every one of them. jstack confirmed that this was being done
from getNativeFileLinkStatus().
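
To illustrate why a per-file fork and exec is so costly, here is a minimal sketch that contrasts the two access patterns on a plain directory. It is not the Hadoop code: the class ListingSketch and the methods statByFork and statInProcess are hypothetical names invented for this example, and it assumes a Unix-like system with a stat binary on the PATH and Java 9 or later.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical illustration only; none of these names come from Hadoop.
public class ListingSketch {

    /**
     * Slow pattern: fork and exec an external `stat` process for every
     * directory entry. Returns the number of entries stat'ed successfully.
     */
    static int statByFork(Path dir) throws IOException, InterruptedException {
        int ok = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path p : entries) {
                Process proc = new ProcessBuilder("stat", p.toString())
                        .redirectErrorStream(true)
                        .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                        .start();
                if (proc.waitFor() == 0) {
                    ok++;
                }
            }
        }
        return ok;
    }

    /**
     * Cheaper pattern: read attributes in-process, without following
     * symlinks, so no child process is created per entry.
     * Returns the number of entries whose attributes were read.
     */
    static int statInProcess(Path dir) throws IOException {
        int ok = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path p : entries) {
                BasicFileAttributes attrs = Files.readAttributes(
                        p, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
                if (attrs != null) {
                    ok++;
                }
            }
        }
        return ok;
    }

    public static void main(String[] args) throws Exception {
        // Build a scratch directory with many small files.
        Path dir = Files.createTempDirectory("listing-sketch");
        for (int i = 0; i < 200; i++) {
            Files.createFile(dir.resolve("f" + i));
        }

        long t0 = System.nanoTime();
        int inProcess = statInProcess(dir);
        long t1 = System.nanoTime();
        System.out.printf("in-process: %d entries, %d ms%n",
                inProcess, (t1 - t0) / 1_000_000);

        try {
            long t2 = System.nanoTime();
            int forked = statByFork(dir);
            long t3 = System.nanoTime();
            System.out.printf("fork+exec:  %d entries, %d ms%n",
                    forked, (t3 - t2) / 1_000_000);
        } catch (IOException e) {
            // Assumption violated: no `stat` binary available on this system.
            System.out.println("stat binary not available: " + e.getMessage());
        }
    }
}
```

Both methods visit the same entries; only the cost per entry differs. The forked variant creates one child process per file, so its cost grows with the total number of files in the directory rather than with the number of files actually wanted.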

