hadoop-common-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15547) WASB: improve listStatus performance
Date Thu, 19 Jul 2018 19:57:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16549767#comment-16549767 ]

Hudson commented on HADOOP-15547:
---------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14594 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14594/])
HADOOP-15547/ WASB: improve listStatus performance. Contributed by (stevel: rev 45d9568aaaf532a6da11bd7c1844ff81bf66bab1)
* (edit) hadoop-tools/hadoop-azure/pom.xml
* (delete) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/PartialListing.java
* (edit) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/AzureNativeFileSystemStore.java
* (edit) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/FileMetadata.java
* (edit) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeFileSystemStore.java
* (edit) hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azure/NativeAzureFileSystem.java
* (edit) hadoop-tools/hadoop-azure/dev-support/findbugs-exclude.xml
* (add) hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azure/ITestListPerformance.java


> WASB: improve listStatus performance
> ------------------------------------
>
>                 Key: HADOOP-15547
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15547
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/azure
>    Affects Versions: 2.9.1, 3.0.2
>            Reporter: Thomas Marquardt
>            Assignee: Thomas Marquardt
>            Priority: Major
>         Attachments: HADOOP-15547-004.patch, HADOOP-15547-004.patch, HADOOP-15547.001.patch,
>                      HADOOP-15547.002.patch, HADOOP-15547.003.patch
>
>
> The WASB implementation of FileSystem.listStatus is very slow due to an O(n!) algorithm
> for removing duplicates, and it uses too much memory due to the extra conversion from
> BlobListItem to FileMetadata to FileStatus.  It takes over 30 minutes to list 700,000 files.
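
The message does not include the patch itself, so the following is only a minimal sketch of the general idea behind the complaint: removing duplicate entries from a listing by rescanning what has been kept so far is slow, while a hash-based set does it in one pass. The class ListingDedup, its method names, and the use of plain string keys are hypothetical and are not taken from hadoop-azure; the actual change in rev 45d9568 may work differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of hadoop-azure. */
class ListingDedup {

    /** Slow baseline: every key is compared against all keys kept so far. */
    static List<String> dedupeNaive(List<String> keys) {
        List<String> unique = new ArrayList<>();
        for (String key : keys) {
            if (!unique.contains(key)) { // linear scan for each element
                unique.add(key);
            }
        }
        return unique;
    }

    /** Single pass: a LinkedHashSet drops repeats and preserves first-seen order. */
    static List<String> dedupe(List<String> keys) {
        Set<String> seen = new LinkedHashSet<>(keys);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Assumed sample keys; a real listing would come from the blob store.
        List<String> listing = Arrays.asList("dir/a", "dir/b", "dir/a", "dir/c", "dir/b");
        System.out.println(dedupe(listing)); // prints [dir/a, dir/b, dir/c]
    }
}
```

Both methods return the same result; the difference is that the naive version does work proportional to the square of the listing size, which becomes noticeable at the scale of hundreds of thousands of entries mentioned in the issue.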



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

