hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HADOOP-2566) need FileSystem#globStatus method
Date Fri, 11 Jan 2008 19:38:34 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558081#action_12558081
] 

rangadi edited comment on HADOOP-2566 at 1/11/08 11:36 AM:
----------------------------------------------------------------

globStatus would certainly be useful, since globPaths() is used in many places where we really
want globStatus(). globStatus is much more efficient in those cases, since we often do
'{{for(path : globPaths(pattern)) { stat = listStatus(path) ... }}}', which costs an extra
filesystem call for every matched path.
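
To make the cost of that loop concrete, here is a toy sketch that only counts simulated filesystem calls. It is not the Hadoop API: ToyFs, Status, statusOf() and the calls counter are hypothetical names invented for illustration, and the matcher handles only '*' within one path component.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

/** Toy in-memory model, NOT the Hadoop API: it only counts simulated filesystem calls. */
public class GlobCallCount {

    /** Hypothetical stand-in for FileStatus. */
    public static final class Status {
        public final String path;
        public final long length;
        Status(String path, long length) { this.path = path; this.length = length; }
    }

    /** Hypothetical stand-in for a FileSystem; every query increments {@code calls}. */
    public static final class ToyFs {
        private final Map<String, Long> files = new TreeMap<>();
        public int calls = 0;

        public void create(String path, long length) { files.put(path, length); }

        // Assumption: only '*' is supported, matching within a single path component.
        private static Pattern compile(String glob) {
            String[] parts = glob.split("\\*", -1);
            StringBuilder regex = new StringBuilder();
            for (int i = 0; i < parts.length; i++) {
                if (i > 0) regex.append("[^/]*");
                regex.append(Pattern.quote(parts[i]));
            }
            return Pattern.compile(regex.toString());
        }

        /** One call; returns the matching paths only. */
        public List<String> globPaths(String glob) {
            calls++;
            Pattern p = compile(glob);
            List<String> out = new ArrayList<>();
            for (String path : files.keySet()) {
                if (p.matcher(path).matches()) out.add(path);
            }
            return out;
        }

        /** One call per path; plays the role of listStatus(path) in the loop above. */
        public Status statusOf(String path) {
            calls++;
            Long len = files.get(path);
            return len == null ? null : new Status(path, len);
        }

        /** One call; returns a status object for every match. */
        public List<Status> globStatus(String glob) {
            calls++;
            Pattern p = compile(glob);
            List<Status> out = new ArrayList<>();
            for (Map.Entry<String, Long> e : files.entrySet()) {
                if (p.matcher(e.getKey()).matches()) {
                    out.add(new Status(e.getKey(), e.getValue()));
                }
            }
            return out;
        }
    }

    public static void main(String[] args) {
        ToyFs fs = new ToyFs();
        fs.create("/logs/a.txt", 10);
        fs.create("/logs/b.txt", 20);
        fs.create("/logs/c.txt", 30);
        fs.create("/logs/d.bin", 40);

        // Pattern from the comment: glob for paths, then ask for each status.
        for (String path : fs.globPaths("/logs/*.txt")) fs.statusOf(path);
        System.out.println("globPaths + per-path status: " + fs.calls + " calls");

        // Same information from a single globStatus call.
        fs.calls = 0;
        fs.globStatus("/logs/*.txt");
        System.out.println("globStatus: " + fs.calls + " calls");
    }
}
```

In this toy model, three matching files cost 1 + 3 = 4 calls with the globPaths loop and 1 call with globStatus. How much that matters on a real filesystem depends on the per-call cost, which this sketch does not try to model.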

I am not sure if globPaths() can go away. One difference I see is that globPaths("/non/existent/path/withoutglob")
returns the path as is, without any filesystem interaction (as expected). But globStatus("/non/existent/path/withoutglob")
will ask the filesystem and will return null (or an array with zero entries).
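
A minimal sketch of that difference, again as a toy model rather than the real API. ToyFs, exists() and the String[] return types are assumptions made for illustration (a real globStatus would return FileStatus[]), and only patterns without glob characters are handled.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the no-glob case only; NOT the Hadoop API. */
public class NoGlobLookup {

    /** Hypothetical filesystem that counts how often it is consulted. */
    public static final class ToyFs {
        private final Set<String> existing = new HashSet<>();
        public int calls = 0;

        public void create(String path) { existing.add(path); }

        public boolean exists(String path) {
            calls++;
            return existing.contains(path);
        }
    }

    static boolean hasGlobChars(String pattern) {
        for (char c : pattern.toCharArray()) {
            if ("*?[{".indexOf(c) >= 0) return true;
        }
        return false;
    }

    private static void requireLiteral(String pattern) {
        if (hasGlobChars(pattern)) {
            throw new UnsupportedOperationException("glob expansion is omitted in this sketch");
        }
    }

    /** Models globPaths() as described above: a literal path comes back untouched. */
    public static String[] globPaths(ToyFs fs, String pattern) {
        requireLiteral(pattern);
        return new String[] { pattern };   // no filesystem interaction
    }

    /** Models globStatus(): the filesystem is asked, so a missing path yields nothing. */
    public static String[] globStatus(ToyFs fs, String pattern) {
        requireLiteral(pattern);
        // Assumption: an empty array is returned; the comment above allows null as well.
        return fs.exists(pattern) ? new String[] { pattern } : new String[0];
    }

    public static void main(String[] args) {
        ToyFs fs = new ToyFs();
        String p = "/non/existent/path/withoutglob";
        System.out.println("globPaths entries:  " + globPaths(fs, p).length + ", calls so far: " + fs.calls);
        System.out.println("globStatus entries: " + globStatus(fs, p).length + ", calls so far: " + fs.calls);
    }
}
```

Callers that currently pass a not-yet-existing path through globPaths() and rely on getting it back would need to handle the empty (or null) result if they switch to globStatus().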


  
> need FileSystem#globStatus method
> ---------------------------------
>
>                 Key: HADOOP-2566
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2566
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Doug Cutting
>            Assignee: Hairong Kuang
>             Fix For: 0.16.0
>
>
> To remove the cache of FileStatus in DFSPath (HADOOP-2565) without hurting performance,
> we must use file enumeration APIs that return FileStatus[] rather than Path[].  Currently
> we have FileSystem#globPaths(), but that method should be deprecated and replaced with a FileSystem#globStatus().
> We need to deprecate FileSystem#globPaths() in 0.16 in order to remove the cache in 0.17.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

