pig-dev mailing list archives

From "Cheolsoo Park (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-2909) Add a new option for ignoring corrupted files to AvroStorage load func
Date Thu, 13 Sep 2012 06:07:07 GMT

     [ https://issues.apache.org/jira/browse/PIG-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-2909:

    Attachment: PIG-2909-2.patch
> Add a new option for ignoring corrupted files to AvroStorage load func
> ----------------------------------------------------------------------
>                 Key: PIG-2909
>                 URL: https://issues.apache.org/jira/browse/PIG-2909
>             Project: Pig
>          Issue Type: Improvement
>          Components: piggybank
>    Affects Versions: 0.10.0
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>         Attachments: PIG-2909-2.patch, PIG-2909-avro_test_files.tar.gz, PIG-2909.patch
> Currently, AvroStorage load fails with AvroRuntimeException when encountering corrupted input files. For example,
> {code}
> ERROR 2997: Unable to recreate exception from backed error: java.io.IOException: org.apache.avro.AvroRuntimeException: java.io.IOException: Invalid sync!
> 	at org.apache.pig.piggybank.storage.avro.AvroStorage.getNext(AvroStorage.java:283)
> {code}
> But it is not always desirable to fail the Pig job for bad files. It is sometimes more useful to skip them and continue.
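
The skip-and-continue behavior requested above can be illustrated with a minimal, self-contained sketch. It is not the code in PIG-2909-2.patch and does not use the real AvroStorage or Avro APIs: `FakeFile`, `readFile`, `readAll`, and the `ignoreBadFiles` flag are hypothetical names invented for illustration, and the corrupted file is simulated by throwing an `IOException` with the same "Invalid sync!" message seen in the stack trace.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkipCorruptSketch {

    /** Hypothetical stand-in for one input file (not an Avro or Pig type). */
    static final class FakeFile {
        final String name;
        final List<String> records;
        final boolean corrupted;

        FakeFile(String name, List<String> records, boolean corrupted) {
            this.name = name;
            this.records = records;
            this.corrupted = corrupted;
        }
    }

    /** Simulates reading a file; a corrupted file fails the way the report describes. */
    static List<String> readFile(FakeFile file) throws IOException {
        if (file.corrupted) {
            throw new IOException("Invalid sync!");
        }
        return file.records;
    }

    /**
     * Reads every file in order. With ignoreBadFiles == false the first
     * corrupted file aborts the whole read (the behavior the report describes);
     * with ignoreBadFiles == true the bad file is logged and skipped.
     */
    static List<String> readAll(List<FakeFile> files, boolean ignoreBadFiles) throws IOException {
        List<String> out = new ArrayList<>();
        for (FakeFile file : files) {
            try {
                out.addAll(readFile(file));
            } catch (IOException e) {
                if (!ignoreBadFiles) {
                    throw e; // fail the job, as before
                }
                System.err.println("Skipping corrupted file " + file.name + ": " + e.getMessage());
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        List<FakeFile> files = Arrays.asList(
                new FakeFile("a.avro", Arrays.asList("a1", "a2"), false),
                new FakeFile("b.avro", Arrays.asList("b1"), true),
                new FakeFile("c.avro", Arrays.asList("c1"), false));
        // Records from the readable files only; b.avro is skipped.
        System.out.println(readAll(files, true));
    }
}
```

Making the skip opt-in keeps the existing fail-fast behavior as the default, so a job silently drops data only when the user has asked for it.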

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
