pig-dev mailing list archives

From "Prashant Kommireddi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-3223) AvroStorage does not handle comma separated input paths
Date Sat, 23 Mar 2013 01:15:15 GMT

    [ https://issues.apache.org/jira/browse/PIG-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13611505#comment-13611505 ]

Prashant Kommireddi commented on PIG-3223:
------------------------------------------

Thanks [~dreambird]. 

I have a question regarding the current approach - why isn't the globbing implemented in PigAvroInputFormat?
Overriding listStatus(JobContext job) should be cleaner, unless I am missing something very
specific to Avro?
                
> AvroStorage does not handle comma separated input paths
> -------------------------------------------------------
>
>                 Key: PIG-3223
>                 URL: https://issues.apache.org/jira/browse/PIG-3223
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>    Affects Versions: 0.10.0, 0.11
>            Reporter: Michael Kramer
>            Assignee: Johnny Zhang
>         Attachments: AvroStorage.patch, AvroStorage.patch-2, AvroStorageUtils.patch, AvroStorageUtils.patch-2, PIG-3223.patch.txt, PIG-3223.patch.txt
>
>
> In Pig 0.11, a patch was issued to AvroStorage to support globs and comma separated input
> paths (PIG-2492).  While this function works fine for glob-formatted input paths, it fails
> when issued a standard comma separated list of paths.  fs.globStatus does not seem to be able
> to parse out such a list, and a java.net.URISyntaxException is thrown when toURI is called
> on the path.
> I have a working fix for this, but it's extremely ugly (basically checking if the string
> of input paths is globbed, otherwise splitting on ",").  I'm sure there's a more elegant solution.
> I'd be happy to post the relevant methods and "fixes" if necessary.
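
For illustration only, here is a minimal sketch of one way the splitting described above could be done without a separate "is it globbed?" check: split on commas, but only on those that sit outside a {...} glob alternation. This is not the code from any of the attached patches. The class name CommaPathSplitter and the method split are hypothetical, and trimming whitespace and dropping empty entries are assumptions rather than documented AvroStorage behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pig or piggybank.
final class CommaPathSplitter {

    private CommaPathSplitter() {
    }

    /**
     * Splits a comma separated list of paths, ignoring commas that appear
     * inside a {...} glob alternation such as /data/{01,02}/*.avro.
     */
    static List<String> split(String commaSeparatedPaths) {
        List<String> paths = new ArrayList<>();
        int braceDepth = 0; // > 0 while inside a {...} group
        int start = 0;      // index where the current path begins

        for (int i = 0; i < commaSeparatedPaths.length(); i++) {
            char c = commaSeparatedPaths.charAt(i);
            if (c == '{') {
                braceDepth++;
            } else if (c == '}') {
                if (braceDepth > 0) {
                    braceDepth--;
                }
            } else if (c == ',' && braceDepth == 0) {
                addIfNotEmpty(paths, commaSeparatedPaths.substring(start, i));
                start = i + 1;
            }
        }
        addIfNotEmpty(paths, commaSeparatedPaths.substring(start));
        return paths;
    }

    // Assumption: surrounding whitespace is not significant and empty
    // entries (for example from a trailing comma) are skipped.
    private static void addIfNotEmpty(List<String> paths, String candidate) {
        String trimmed = candidate.trim();
        if (!trimmed.isEmpty()) {
            paths.add(trimmed);
        }
    }
}
```

With this sketch, split("/data/2013/{01,02}/*.avro,/data/extra/part-0.avro") yields two entries, and each entry could then be handed to fs.globStatus on its own instead of passing the whole comma separated string at once.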

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
