pig-dev mailing list archives

From "Prashant Kommireddi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-3223) AvroStorage does not handle comma separated input paths
Date Sat, 23 Mar 2013 01:15:15 GMT

    [ https://issues.apache.org/jira/browse/PIG-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13611505#comment-13611505 ]

Prashant Kommireddi commented on PIG-3223:

Thanks [~dreambird]. 

I have a question regarding the current approach: why isn't the globbing implemented in
PigAvroInputFormat? Overriding listStatus(JobContext job) should be cleaner, unless I am
missing something very specific to Avro.
> AvroStorage does not handle comma separated input paths
> -------------------------------------------------------
>                 Key: PIG-3223
>                 URL: https://issues.apache.org/jira/browse/PIG-3223
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>    Affects Versions: 0.10.0, 0.11
>            Reporter: Michael Kramer
>            Assignee: Johnny Zhang
>         Attachments: AvroStorage.patch, AvroStorage.patch-2, AvroStorageUtils.patch,
>                      AvroStorageUtils.patch-2, PIG-3223.patch.txt, PIG-3223.patch.txt
>
> In Pig 0.11, a patch was issued to AvroStorage to support globs and comma separated input
> paths (PIG-2492). While this function works fine for glob-formatted input paths, it fails
> when issued a standard comma separated list of paths. fs.globStatus does not seem to be able
> to parse out such a list, and a java.net.URISyntaxException is thrown when toURI is called
> on the path.
> I have a working fix for this, but it's extremely ugly (basically checking if the string
> of input paths is globbed, otherwise splitting on ","). I'm sure there's a more elegant
> solution. I'd be happy to post the relevant methods and "fixes" if necessary.
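
For illustration only, here is a minimal sketch of one way to split a location string on commas without breaking glob alternations such as {2012,2013}. It is not the code from any of the attached patches. The class name InputPathSplitter and its split method are hypothetical, and the sketch assumes that commas inside curly braces belong to a glob while all other commas separate paths.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Pig or piggybank.
class InputPathSplitter {

    // Splits a location string on commas that sit outside {...} glob
    // alternations, so "/logs/{2012,2013}/*.avro" stays in one piece.
    static List<String> split(String location) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int braceDepth = 0;
        for (int i = 0; i < location.length(); i++) {
            char c = location.charAt(i);
            if (c == '{') {
                braceDepth++;
            } else if (c == '}' && braceDepth > 0) {
                braceDepth--;
            }
            if (c == ',' && braceDepth == 0) {
                // A top-level comma ends the current path.
                addIfNotEmpty(paths, current);
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        addIfNotEmpty(paths, current);
        return paths;
    }

    // Trims the accumulated text and skips empty entries (e.g. from "a,,b").
    private static void addIfNotEmpty(List<String> paths, StringBuilder sb) {
        String path = sb.toString().trim();
        if (!path.isEmpty()) {
            paths.add(path);
        }
    }

    public static void main(String[] args) {
        System.out.println(split("/data/a.avro,/data/b.avro"));
        System.out.println(split("/logs/{2012,2013}/*.avro,/extra/part-0.avro"));
    }
}
```

Each resulting entry could then be handed to fs.globStatus on its own, which would avoid passing the whole comma separated string as a single path. Whether this belongs in AvroStorage or in the input format's listStatus override is the open question in the comment above.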

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
