hadoop-pig-dev mailing list archives

From "Jeff Zhang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-794) Use Avro serialization in Pig
Date Tue, 07 Sep 2010 04:44:35 GMT

    [ https://issues.apache.org/jira/browse/PIG-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906671#action_12906671 ]

Jeff Zhang commented on PIG-794:
--------------------------------

Dmitriy,

In my patch I represent InternalMap as an Avro array whose elements are records with two fields
(one for the key and one for the value).
However, reading the data back throws a strange exception, and I can't tell what is wrong with my code:


{code}
Exception in thread "main" java.lang.NullPointerException
	at org.apache.avro.io.parsing.Parser.advance(Parser.java:86)
	at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:121)
	at org.apache.pig.impl.io.avro.PigDataRecordReader.readRecord(PigDataRecordReader.java:77)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:106)
	at org.apache.pig.impl.io.avro.PigDataRecordReader.readRecord(PigDataRecordReader.java:66)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:106)
	at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:184)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:108)
	at org.apache.pig.impl.io.avro.PigDataRecordReader.readRecord(PigDataRecordReader.java:81)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:106)
	at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:184)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:108)
	at org.apache.pig.impl.io.avro.PigDataRecordReader.readRecord(PigDataRecordReader.java:83)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:106)
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:97)
	at org.apache.avro.file.DataFileStream.next(DataFileStream.java:198)
	at org.apache.avro.file.DataFileStream.next(DataFileStream.java:185)
	at org.apache.pig.impl.io.avro.PigData.main(PigData.java:224)

{code}
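
For illustration, the map representation described above (a map flattened into an array of key/value records) can be sketched with plain Java collections. This is only a sketch under stated assumptions: it does not use the Avro API or any code from the patch, and the names MapAsEntryArray, Entry, toEntries and fromEntries are hypothetical. It does not reproduce or explain the exception.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MapAsEntryArray {

    // Hypothetical stand-in for a record with two fields: key and value.
    static final class Entry {
        final Object key;
        final Object value;

        Entry(Object key, Object value) {
            this.key = key;
            this.value = value;
        }
    }

    // Flatten a map into a list of (key, value) entries, in the map's iteration order.
    static List<Entry> toEntries(Map<Object, Object> map) {
        List<Entry> entries = new ArrayList<>(map.size());
        for (Map.Entry<Object, Object> e : map.entrySet()) {
            entries.add(new Entry(e.getKey(), e.getValue()));
        }
        return entries;
    }

    // Rebuild the map from the entry list; a later duplicate key overwrites an earlier one.
    static Map<Object, Object> fromEntries(List<Entry> entries) {
        Map<Object, Object> map = new LinkedHashMap<>();
        for (Entry e : entries) {
            map.put(e.key, e.value);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Object, Object> original = new LinkedHashMap<>();
        original.put("name", "pig");
        original.put(42, 3.5);

        List<Entry> entries = toEntries(original);
        Map<Object, Object> roundTripped = fromEntries(entries);

        System.out.println("entries: " + entries.size());
        System.out.println("round trip equal: " + roundTripped.equals(original));
    }
}
```

Unlike a map type that only allows string keys, an array of key/value records places no restriction on the key type, which is presumably why this shape suits a map with arbitrary object keys.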

> Use Avro serialization in Pig
> -----------------------------
>
>                 Key: PIG-794
>                 URL: https://issues.apache.org/jira/browse/PIG-794
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.2.0
>            Reporter: Rakesh Setty
>            Assignee: Dmitriy V. Ryaboy
>         Attachments: avro-0.1-dev-java_r765402.jar, AvroStorage.patch, AvroStorage_2.patch, AvroStorage_3.patch, AvroStorage_4.patch, AvroTest.java, jackson-asl-0.9.4.jar, PIG-794.patch
>
>
> We would like to use Avro serialization in Pig to pass data between MR jobs instead of the current BinStorage. Attached is an implementation of AvroBinStorage which performs significantly better compared to BinStorage on our benchmarks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

