pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-2195) AvroStorage fails to STORE when LOADing via PigStorage
Date Mon, 15 Aug 2011 23:25:27 GMT

     [ https://issues.apache.org/jira/browse/PIG-2195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-2195:
----------------------------

       Resolution: Fixed
    Fix Version/s: 0.10
           Status: Resolved  (was: Patch Available)

Patch checked in.  Thanks Bill.

> AvroStorage fails to STORE when LOADing via PigStorage
> ------------------------------------------------------
>
>                 Key: PIG-2195
>                 URL: https://issues.apache.org/jira/browse/PIG-2195
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Bill Graham
>            Assignee: Bill Graham
>             Fix For: 0.10
>
>         Attachments: PIG-2195_1.patch, expected_testRecordSplitFromText1.avro, expected_testRecordSplitFromText2.avro
>
>
> Reading data via {{PigStorage}} and writing it via {{AvroStorage}} fails with an exception like this:
> {{java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.avro.generic.IndexedRecord}}
> The Pig script in this section of the documentation is an example that fails this way:
> http://linkedin.jira.com/wiki/display/HTOOLS/AvroStorage+-+Pig+support+for+Avro+data#AvroStorage-PigsupportforAvrodata-A.Howtostoredataindifferentways.
> A workaround currently exists to produce avro from TSVs like this:
> {noformat}
> avro = LOAD 'inputPath/' AS (foo);
> STORE avro INTO 'outputPath/' USING oap.piggybank.storage.avro.AvroStorage(
>   '{"data":"data_file.avro",
>     "same":"data_file.avro", "field0":"def:bar"}');
> {noformat}
> This is redundant, though: {{data}} and {{same}} seem to indicate the same thing. This approach also requires an existing Avro data file. This patch will make the following alternate constructor syntaxes work as well.
> # Read schema from an existing data file:
> {noformat}
>   '{"data":"data_file.avro", "field0":"def:bar"}');
> {noformat}
> # Read schema from an existing schema file:
> {noformat}
>   '{"schema_file":"data_file.avsc", "field0":"def:bar"}');
> {noformat}
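> For reference, a minimal sketch of complete statements using the two syntaxes above, assembled from the workaround script. The input path, file names, {{oap}} package prefix, and field names are the placeholders already used in this description; the two separate output paths are hypothetical and only there so both statements can sit in one script.
> {noformat}
> -- Load tab-separated text with the default PigStorage loader.
> avro = LOAD 'inputPath/' AS (foo);
> -- 1. Read the schema from an existing data file.
> STORE avro INTO 'outputPath1/' USING oap.piggybank.storage.avro.AvroStorage(
>   '{"data":"data_file.avro", "field0":"def:bar"}');
> -- 2. Read the schema from an existing schema file.
> STORE avro INTO 'outputPath2/' USING oap.piggybank.storage.avro.AvroStorage(
>   '{"schema_file":"data_file.avsc", "field0":"def:bar"}');
> {noformat}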

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
