hadoop-hdfs-issues mailing list archives

From "Mariappan Asokan (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2461) Support HDFS file name globbing in libhdfs
Date Tue, 18 Oct 2011 01:58:10 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129418#comment-13129418 ]

Mariappan Asokan commented on HDFS-2461:

   Thanks for your comments.  I am aware of the methods in the FileSystem class.  However, I wanted
the C API to be simpler.  Callers can iterate through the array and call hdfsGetPathInfo()
to get the equivalent of a FileStatus object if they wish.  The caller can also pass each file
name to a filter function.
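
The caller-side pattern described above might look like the following minimal sketch.  It assumes only the contract stated in the proposal (a NULL-terminated array of path strings, or NULL on error).  The names path_filter, ends_with_log and count_accepted are hypothetical and exist only for illustration; they are not part of libhdfs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical filter type: returns nonzero to keep a path. */
typedef int (*path_filter)(const char *path);

/* Example filter (an assumption for illustration): keep names ending in ".log". */
static int ends_with_log(const char *path)
{
    size_t n = strlen(path);
    return n >= 4 && strcmp(path + n - 4, ".log") == 0;
}

/*
 * Walk a NULL-terminated array, such as the one the proposed hdfsGlob()
 * would return, and count the entries accepted by the filter.  A NULL
 * filter accepts everything.  A real caller could call
 * hdfsGetPathInfo(fs, names[i]) inside the loop to get per-file details.
 */
static size_t count_accepted(char **names, path_filter keep)
{
    size_t kept = 0;

    if (names == NULL)          /* the proposal returns NULL on error */
        return 0;

    for (size_t i = 0; names[i] != NULL; i++) {
        if (keep == NULL || keep(names[i]))
            kept++;
    }
    return kept;
}
```

With the proposed API, the array passed to count_accepted() would come from hdfsGlob() and would be released with hdfsFreeGlob() afterwards.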

Having said that, the use cases of the API will dictate the function signature (simplicity
versus convenience).  My current requirement is just to get the file names matching wildcard patterns.
I would like to hear opinions from other developers before finalizing the API.

> Support HDFS file name globbing in libhdfs
> ------------------------------------------
>                 Key: HDFS-2461
>                 URL: https://issues.apache.org/jira/browse/HDFS-2461
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: libhdfs
>            Reporter: Mariappan Asokan
>            Priority: Minor
> This is to enhance the C API in libhdfs to support HDFS file name globbing.  The proposal
> is to keep the new API simple and return a list of matched HDFS path names.  Callers can use
> the existing hdfsGetPathInfo() to get additional information on each matched path.  The following
> code snippet shows the proposed API enhancements:
> {code:title=hdfs.h}
> /**
>  * hdfsGlob - Get all the HDFS file names that match a glob pattern.  The
>  * returned result will be sorted by the file names.  The last element in the
>  * array is NULL.  The function hdfsFreeGlob() should be called to free this
>  * array and its contents.
>  * @param fs The configured filesystem handle.
>  * @param globPattern The glob pattern to match file names against.  Note that
>  * this is not a POSIX regular expression but rather a POSIX glob pattern.
>  * @return Returns a dynamically-allocated array of strings; if there is no
>  * match, an array with one entry that has a NULL value will be returned.  If
>  * there is an error, NULL will be returned.
>  */
> char ** hdfsGlob(hdfsFS fs, const char *globPattern);
> /**
>  * hdfsFreeGlob - Free up the array returned by hdfsGlob().
>  * @param globResult The array of dynamically-allocated strings returned by
>  * hdfsGlob().
>  */
> void hdfsFreeGlob(char **globResult);
> {code}
> Please comment on the proposed API above.  I will start the implementation and testing.
> However, I need a committer to work with.
> Thanks.
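
The memory contract in the proposed header comment (a sorted, NULL-terminated, dynamically-allocated array, freed together with its contents) can be illustrated with a minimal sketch.  This is not the libhdfs implementation, which did not exist at the time of the proposal; sketch_make_result and sketch_free_glob are hypothetical names, and the input names are supplied by the caller rather than read from HDFS.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of char pointers. */
static int cmp_names(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Free a NULL-terminated array and every string in it.  NULL is accepted. */
static void sketch_free_glob(char **result)
{
    if (result == NULL)
        return;
    for (size_t i = 0; result[i] != NULL; i++)
        free(result[i]);
    free(result);
}

/*
 * Build a sorted, NULL-terminated copy of n names.  With n == 0 the result
 * is an array whose only entry is NULL (the "no match" case); NULL is
 * returned only when allocation fails (the "error" case).
 */
static char **sketch_make_result(const char *const *names, size_t n)
{
    char **out = malloc((n + 1) * sizeof *out);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i <= n; i++)
        out[i] = NULL;              /* keeps the array terminated at all times */

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(names[i]) + 1;
        out[i] = malloc(len);
        if (out[i] == NULL) {
            sketch_free_glob(out);  /* entries from i onward are still NULL */
            return NULL;
        }
        memcpy(out[i], names[i], len);
    }
    qsort(out, n, sizeof *out, cmp_names);
    return out;
}
```

Keeping "no match" (an array holding only NULL) distinct from "error" (a NULL return) lets callers use the same loop for empty and non-empty results and check for failure once.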

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira