pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "John Smith (JIRA)" <j...@apache.org>
Subject [jira] [Created] (PIG-4793) AvroStorage issues during write into HDFS
Date Mon, 01 Feb 2016 09:52:39 GMT
John Smith created PIG-4793:
-------------------------------

             Summary: AvroStorage issues during write into HDFS
                 Key: PIG-4793
                 URL: https://issues.apache.org/jira/browse/PIG-4793
             Project: Pig
          Issue Type: Bug
          Components: piggybank
            Reporter: John Smith


Dear,

I created the simple pig script that reads two avro files, merges the two relations and store
it into output avro file.

I tried to store the output relation into avro file using:
 store outputSet into 'avrostorage' using AvroStorage();

Some workaround was needed because pig has problems to process schema with :: 

Added code below the result 'avrostorage' file was generated.
outputSet = foreach outputSet generate $0 as (name:chararray) , $1 as (customerId:chararray),
$2 as (VIN:chararray) , $3 as (Birthdate:chararray), $4 as (Mileage:chararray) ,$5 as (Fuel_Consumption:chararray);
 

When I tried to store avro file with the schema definition using code below,
strange error is occurring https://bpaste.net/show/ccf0cbef06a9 (Full log).

...
10.0.1.47:8050 2016-01-29 17:24:39,600 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil
- 1 map reduce job(s) failed!
...

STORE outputSet INTO '/avro-dest/Test-20160129-1401822' 
 USING org.apache.pig.piggybank.storage.avro.AvroStorage('no_schema_check', 'schema', '....')


Sample data and pig script:

https://drive.google.com/file/d/0B6RZ_9vVuTEcd01aWm9zczNUUWc/view


I think these might be two important issues, could you please investigate?

Thank you



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message