hadoop-common-issues mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6198) FileSystem filtering should work on FileStatus rather than Path objects
Date Tue, 18 Aug 2009 11:02:14 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744454#action_12744454 ]

Chris Douglas commented on HADOOP-6198:
---------------------------------------

Is the proposal to retain two versions of globStatus, one taking a PathFilter and another
taking e.g. a StatusFilter? This seems like a premature optimization, given that we have no
example of a FileSystem for which it makes a significant performance difference, or yields
any improvement at all. If one wants to implement a FileSystem that only supports filtering
by Path, or that optimizes that particular call, then there are plenty of other ways to effect
this without adding a separate method to the core API. Your point about a possible performance
gain is taken, but it doesn't seem like the right tradeoff, particularly without a concrete
use case.

We'll have to keep PathFilter around for at least another release for backwards-compatibility,
but I'd like to deprecate it.
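
As a rough sketch of one such alternative (not the patch itself), an existing path-based
filter could be wrapped in a status-based one, so that only a single status-based entry
point is needed. Everything below is a simplified stand-in: Status, PathOnlyFilter,
StatusFilter, fromPathFilter and listMatching are hypothetical names chosen for
illustration, not Hadoop classes or methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PathFilterAdapterSketch {

    /** Simplified stand-in for a file status; NOT Hadoop's FileStatus. */
    static final class Status {
        final String path;
        final long length;

        Status(String path, long length) {
            this.path = path;
            this.length = length;
        }
    }

    /** Stand-in for a legacy filter that only sees the path (assumed shape). */
    interface PathOnlyFilter {
        boolean accept(String path);
    }

    /** Hypothetical status-based filter; the name is an assumption. */
    interface StatusFilter {
        boolean accept(Status status);
    }

    /** Adapts a legacy path-only filter by handing it the path held in the status. */
    static StatusFilter fromPathFilter(PathOnlyFilter legacy) {
        return status -> legacy.accept(status.path);
    }

    /** Single status-based entry point; legacy callers go through the adapter. */
    static List<Status> listMatching(List<Status> candidates, StatusFilter filter) {
        List<Status> kept = new ArrayList<>();
        for (Status s : candidates) {
            if (filter.accept(s)) {
                kept.add(s);
            }
        }
        return kept;
    }

    /** Made-up listing used only for illustration. */
    static List<Status> sampleListing() {
        return Arrays.asList(
            new Status("/logs/app.log", 512L),
            new Status("/logs/app.tmp", 64L),
            new Status("/logs/db.log", 2048L));
    }

    public static void main(String[] args) {
        // A legacy path-only filter keeps working through the adapter.
        PathOnlyFilter onlyLogs = path -> path.endsWith(".log");
        for (Status s : listMatching(sampleListing(), fromPathFilter(onlyLogs))) {
            System.out.println(s.path);
        }
    }
}
```

With an adapter of this kind, the path-only interface could be deprecated while existing
callers continue to work unchanged.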

> FileSystem filtering should work on FileStatus rather than Path objects
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-6198
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6198
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Chris Douglas
>
> There's an avoidable overhead in listing/globbing items with some property (e.g. owned
> by user foo, only files, files larger than _n_ bytes, etc.). Internally, the Path is extracted
> from a FileStatus object and passed to the PathFilter; simply passing the FileStatus object
> would allow one to filter on the information in the status object.
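
A minimal sketch of the idea in the description, assuming a simplified status type rather
than the real Hadoop classes: when the filter receives the whole status object, properties
such as owner, size and file/directory type can be tested directly, with no extra lookup
per path. Status, StatusFilter, filter and sampleListing are hypothetical names used only
for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StatusFilterSketch {

    /** Simplified stand-in for a file status; NOT Hadoop's FileStatus. */
    static final class Status {
        final String path;
        final String owner;
        final long length;
        final boolean directory;

        Status(String path, String owner, long length, boolean directory) {
            this.path = path;
            this.owner = owner;
            this.length = length;
            this.directory = directory;
        }
    }

    /** Hypothetical status-based filter; the name is an assumption. */
    interface StatusFilter {
        boolean accept(Status status);
    }

    /** Keeps the entries the filter accepts, using only data already in each status. */
    static List<Status> filter(List<Status> listing, StatusFilter filter) {
        List<Status> kept = new ArrayList<>();
        for (Status s : listing) {
            if (filter.accept(s)) {
                kept.add(s);
            }
        }
        return kept;
    }

    /** Made-up listing used only for illustration. */
    static List<Status> sampleListing() {
        return Arrays.asList(
            new Status("/data/a.log", "foo", 2048L, false),
            new Status("/data/b.log", "bar", 4096L, false),
            new Status("/data/tmp", "foo", 0L, true),
            new Status("/data/c.log", "foo", 10L, false));
    }

    public static void main(String[] args) {
        // Only files, owned by user foo, larger than 1024 bytes.
        StatusFilter bigFilesOwnedByFoo =
            s -> !s.directory && s.owner.equals("foo") && s.length > 1024L;
        for (Status s : filter(sampleListing(), bigFilesOwnedByFoo)) {
            System.out.println(s.path); // prints /data/a.log
        }
    }
}
```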

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

