hadoop-common-issues mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6198) FileSystem filtering should work on FileStatus rather than Path objects
Date Tue, 18 Aug 2009 07:19:14 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744392#action_12744392 ]

Chris Douglas commented on HADOOP-6198:

Granted, but supporting two filtering interfaces will likely cause maintenance headaches and
be more of a burden to a FS implementor; I'd rather pick an API and not support all possible
variants. If a FileSystem is more efficient with path-based filtering, it can still work with
FileStatus objects, either populating them lazily, filling them with defaults (what many shims
do anyway), or even failing if a user queries unsupported data. Since globStatus returns FileStatus
objects, any implementation will need to construct them for the set of accepted Paths, anyway.
Given that the API seems biased toward FileStatus objects, I'd rather endure a penalty for
the hypothetical FS that doesn't return this information than maintain two separate
filtering APIs.
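
As a rough illustration of the point above, the sketch below shows how a
path-based filter could sit behind a single status-based interface, with the
status object populated lazily. All names here (PathOnlyFilter, StatusFilter,
LazyStatus, fromPathFilter) are hypothetical stand-ins written for this
example; none of them are Hadoop classes, and paths are plain strings to keep
the block self-contained.

```java
import java.util.function.Supplier;

class LazyStatusSketch {
    /** Hypothetical path-only filter, mirroring the shape of a path-based filter. */
    interface PathOnlyFilter {
        boolean accept(String path);
    }

    /** Hypothetical status-based filter: the single API being argued for. */
    interface StatusFilter {
        boolean accept(LazyStatus status);
    }

    /** Stand-in status object whose expensive metadata is fetched on first use. */
    static final class LazyStatus {
        private final String path;
        private final Supplier<Long> lengthLoader;
        private Long length;   // null until the length is first requested
        int loads = 0;         // counts how often the loader actually ran

        LazyStatus(String path, Supplier<Long> lengthLoader) {
            this.path = path;
            this.lengthLoader = lengthLoader;
        }

        String getPath() {
            return path;
        }

        long getLen() {
            if (length == null) {
                length = lengthLoader.get();
                loads++;
            }
            return length;
        }
    }

    /** Adapts a path-only filter so it works through the status-based API. */
    static StatusFilter fromPathFilter(PathOnlyFilter filter) {
        return status -> filter.accept(status.getPath());
    }

    public static void main(String[] args) {
        LazyStatus s = new LazyStatus("/logs/part-00000", () -> 4096L);

        // A path-based filter never touches the lazily loaded metadata.
        StatusFilter byName = fromPathFilter(p -> p.startsWith("/logs/"));
        System.out.println(byName.accept(s) + " loads=" + s.loads);  // true loads=0

        // A filter that needs the length triggers exactly one load.
        StatusFilter bySize = st -> st.getLen() > 1024;
        System.out.println(bySize.accept(s) + " loads=" + s.loads);  // true loads=1
    }
}
```

The other two options mentioned above would fit the same shape: the loader
could return a default value, or getLen could throw
UnsupportedOperationException on a store that cannot supply the length.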

> FileSystem filtering should work on FileStatus rather than Path objects
> -----------------------------------------------------------------------
>                 Key: HADOOP-6198
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6198
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Chris Douglas
> There's an avoidable overhead in listing/globbing items with some property (e.g. owned
> by user foo, only files, files larger than _n_ bytes, etc.). Internally, the Path is extracted
> from a FileStatus object and passed to the PathFilter; simply passing the FileStatus object
> would allow one to filter on the information in the status object.
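
The kind of filtering described in the issue summary might look like the
sketch below. It is a minimal illustration under stated assumptions: Status
is a simplified stand-in for FileStatus, and StatusFilter and list are
hypothetical names invented for this example, not part of the Hadoop API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StatusFilterSketch {
    /** Simplified stand-in for a file status record (not the real FileStatus). */
    static final class Status {
        final String path;
        final String owner;
        final long length;
        final boolean dir;

        Status(String path, String owner, long length, boolean dir) {
            this.path = path;
            this.owner = owner;
            this.length = length;
            this.dir = dir;
        }
    }

    /** Hypothetical filter that sees the whole status, not only the path. */
    interface StatusFilter {
        boolean accept(Status status);
    }

    /** Returns the paths of all entries accepted by the filter. */
    static List<String> list(List<Status> entries, StatusFilter filter) {
        List<String> accepted = new ArrayList<>();
        for (Status s : entries) {
            if (filter.accept(s)) {
                accepted.add(s.path);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<Status> entries = Arrays.asList(
            new Status("/data/a.txt", "foo", 100L, false),
            new Status("/data/b.txt", "bar", 5000L, false),
            new Status("/data/sub", "foo", 0L, true));

        // Only files owned by user foo.
        System.out.println(list(entries, s -> !s.dir && s.owner.equals("foo")));  // [/data/a.txt]

        // Only files larger than 1024 bytes.
        System.out.println(list(entries, s -> !s.dir && s.length > 1024));        // [/data/b.txt]
    }
}
```

With a path-only filter, each of these predicates would need a second lookup
per path to recover the owner, type, or length that the listing already had.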

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
