avro-user mailing list archives

From Ruslan Al-Fakikh <metarus...@gmail.com>
Subject Re: STORE USING AvroStorage - ignores Pig field names, only using their position
Date Sun, 17 Nov 2013 03:16:09 GMT
Thanks, Russell!

Do you mean that this is the expected behavior? Shouldn't AvroStorage map
the Pig fields by name (not by position), matching them to the field
names in the Avro schema?

Thanks,
Ruslan Al-Fakikh


On Sun, Nov 17, 2013 at 6:53 AM, Russell Jurney <russell.jurney@gmail.com> wrote:

> Pig tuples have field order. Swap the order of the fields in your avro
> schema and try again.
>
> On Nov 16, 2013, at 6:19 PM, Ruslan Al-Fakikh <metaruslan@gmail.com>
> wrote:
>
> Hey guys,
>
> When I store with AvroStorage, the names of the Pig tuple fields are
> completely ignored. The field values are written to the result file
> purely by position.
> Here is a simplified test case:
>
> %declare WORKDIR `pwd`
> REGISTER ../../../../lib/external/avro-1.7.4.jar
> REGISTER ../../../../lib/external/json-simple-1.1.jar
> --this is built (manually with Maven) from the latest source:
> -- http://svn.apache.org/viewvc/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/
> REGISTER ../piggybankBuiltFromSource.jar
> REGISTER ../../../../lib/external/jackson-core-asl-1.8.8.jar
> REGISTER ../../../../lib/external/jackson-mapper-asl-1.8.8.jar
>
> --$ cat input.txt
> --data_a data_b
> --data_a data_b
> inputs = LOAD 'input.txt' AS (a: chararray, b: chararray);
>
> DESCRIBE inputs;
> DUMP inputs;
>
> --output:
> --inputs: {a: chararray,b: chararray}
> --(data_a,data_b)
> --(data_a,data_b)
>
> STORE inputs INTO 'output'
>     USING org.apache.pig.piggybank.storage.avro.AvroStorage('{
> "schema":
> {
>   "type" : "record",
>   "name" : "my_schema",
>   "namespace" : "com.my_namespace",
>   "fields" : [
>   {
>     "name" : "b",
>     "type" : "string"
>   },
>   {
>     "name" : "nonsense_name",
>     "type" : "string"
>   }
>   ]
> }
> }');
>
> --output
> --$ java -jar ../../../../lib/external/avro-tools-1.7.4.jar tojson output/part*
> --{"b":"data_a","nonsense_name":"data_b"}
> --{"b":"data_a","nonsense_name":"data_b"}
>
> AvroStorage is built from the latest piggybank code.
> Setting the AvroStorage "debug": 5 parameter didn't help.
>
> $ pig -version
> Apache Pig version 0.11.0-cdh4.3.0 (rexported)
> compiled May 27 2013, 20:48:21
>
> Any help would be appreciated.
>
> Thanks,
> Ruslan Al-Fakikh
>
>
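
One way to act on Russell's suggestion, as a minimal sketch: since the output above shows values being assigned by position, the schema's fields can be listed in the same order as the fields of the Pig relation (a, then b). The schema below is the one from the test case with its two fields swapped. It assumes the same REGISTER statements and the same `inputs` relation, the output path 'output_swapped' is a hypothetical name chosen for illustration, and the script has not been run against this Pig build.

```pig
-- Assumes the REGISTER lines and the `inputs` relation from the test case above.
-- inputs: {a: chararray, b: chararray}
-- The schema fields are listed in tuple order, because AvroStorage appears to
-- fill them by position: the first schema field gets a, the second gets b.
-- 'output_swapped' is a hypothetical output path.
STORE inputs INTO 'output_swapped'
    USING org.apache.pig.piggybank.storage.avro.AvroStorage('{
"schema":
{
  "type" : "record",
  "name" : "my_schema",
  "namespace" : "com.my_namespace",
  "fields" : [
  {
    "name" : "nonsense_name",
    "type" : "string"
  },
  {
    "name" : "b",
    "type" : "string"
  }
  ]
}
}');
```

If changing the schema is not an option, reordering on the Pig side before the STORE, for example with `FOREACH inputs GENERATE b, a;`, should have the same effect, again because only position seems to matter here.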
