pig-dev mailing list archives

From "Jonathan Coveney (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2614) AvroStorage crashes on LOADING a single bad error
Date Mon, 02 Apr 2012 22:25:22 GMT

    [ https://issues.apache.org/jira/browse/PIG-2614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244752#comment-13244752 ]

Jonathan Coveney commented on PIG-2614:
---------------------------------------

Daniel,

The files you are referencing are from https://issues.apache.org/jira/browse/PIG-2551. I used
PigCounterHelper. Those files will also be checked in with the JRuby patch (or we can just
commit that one already, your call). I included the whole thing to be unambiguous, but can
remove the other files.

Russell,

Perhaps an e2e test is better here? Is it that you can't get the function to work, or that
you can't get data to run it on?
                
> AvroStorage crashes on LOADING a single bad error
> -------------------------------------------------
>
>                 Key: PIG-2614
>                 URL: https://issues.apache.org/jira/browse/PIG-2614
>             Project: Pig
>          Issue Type: Bug
>          Components: piggybank
>    Affects Versions: 0.10, 0.11
>            Reporter: Russell Jurney
>              Labels: avro, avrostorage, bad, book, cutting, doug, for, my, pig, sadism
>             Fix For: 0.10, 0.11
>
>         Attachments: PIG-2614_0.patch, PIG-2614_1.patch
>
>
> AvroStorage dies when a single bad record exists, such as one with missing fields.  This
> is very bad on 'big data,' where bad records are inevitable.  See discussion at http://www.quora.com/Big-Data/In-Big-Data-ETL-how-many-records-are-an-acceptable-loss
> for more theory.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
