hadoop-common-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6870) Add FileSystem#listLocatedStatus to list a directory's content together with each file's block locations
Date Thu, 29 Jul 2010 18:39:17 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893760#action_12893760
] 

Suresh Srinivas commented on HADOOP-6870:
-----------------------------------------

Sorry for posting the comments late. I was busy.

# General comment: I have concerns about recursive listing. Applications could abuse it
and generate a large number of requests to HDFS.
# If a file or directory is deleted while recursing through directories, a RuntimeException
is thrown and the application is left with a partial result. Should we ignore a directory that
was in {{stack}} but is no longer found when we later iterate through it?
# FileSystem.java
#* listFile() - the method javadoc could be better organized: first describe the case where the
path is a directory, covering recursive=true and recursive=false; then describe the case where
the path is a file, again covering recursive=true and recursive=false.
#* listFile() - document that RuntimeException and UnsupportedOperationException can be thrown,
and the possible causes. IOException is no longer thrown.
# TestListFiles.java
#* testDirectory() - the comments {{test empty directory}} and {{test directory with 1 file}}
should be moved up to the relevant sections of the test.
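
The question above about directories that disappear from {{stack}} can be sketched as follows. This is only an illustration under stated assumptions, not the code in the attached patches: it uses java.nio.file as a stand-in for the Hadoop FileSystem API, and the class name TolerantListing and its listFiles method are hypothetical. A directory that was queued but is gone by the time it is iterated is skipped; only a missing root is reported as an error.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.NotDirectoryException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not part of Hadoop: stack-based listing that
// tolerates directories deleted while the traversal is in progress.
public class TolerantListing {

    public static List<Path> listFiles(Path root, boolean recursive) throws IOException {
        List<Path> files = new ArrayList<>();
        if (Files.isRegularFile(root)) {
            // A file path yields just that file, whatever 'recursive' is.
            files.add(root);
            return files;
        }
        Deque<Path> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Path dir = stack.pop();
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
                for (Path entry : entries) {
                    if (Files.isDirectory(entry)) {
                        if (recursive) {
                            stack.push(entry); // visit later
                        }
                    } else {
                        files.add(entry);
                    }
                }
            } catch (NoSuchFileException | NotDirectoryException e) {
                if (dir.equals(root)) {
                    throw e; // the caller's own path is missing: a real error
                }
                // A queued subdirectory vanished before we reached it:
                // skip it instead of failing the whole listing.
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path file : listFiles(root, true)) {
            System.out.println(file);
        }
    }
}
```

The trade-off is that the caller gets a best-effort snapshot rather than a failure; whether that is acceptable for listFile() is exactly the open question.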



> Add FileSystem#listLocatedStatus to list a directory's content together with each file's
block locations
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-6870
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6870
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs
>    Affects Versions: 0.22.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.22.0
>
>         Attachments: listFiles.patch, listFiles1.patch, listFiles2.patch, listFiles3.patch,
listFiles4.patch
>
>
> This jira implements the new FileSystem API as proposed in HDFS-202. The new API aims
to eliminate individual "getFileBlockLocations" calls to NN for each file in the input directory
of a job. Instead, a file's block locations are returned together with FileStatus when listing
a directory, thus improving getSplits performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

