pig-user mailing list archives

From Lin Guo <guolin2...@gmail.com>
Subject Re: comments appreciated for pig AvroStorage
Date Wed, 01 Dec 2010 21:07:36 GMT
Yes, we are well aware of the two JIRAs (refer to the related work
section of the doc in
http://snaprojects.jira.com/wiki/display/HTOOLS/AvroStorage+-+Pig+support+for+Avro+data).

We couldn't use the patch in AVRO-592 because we want to process Avro
data generated by our tracking system (or any arbitrary Avro data),
which the patch does not support.

============ from AVRO-592 ====================
The current restriction is that you can't read an arbitrary Avro
record and make a Tuple out of it, even though the total number of
possible Avro schemas that can be coerced into a Tuple is much larger
than supported, I wanted to support that in a separate place.
=============================================

Best,
Lin

On Wed, Dec 1, 2010 at 10:16 AM, Scott Carey <scott@richrelevance.com> wrote:
> There are two other JIRAs with alternate Avro<-->Pig implementations with different
> feature sets.
>
> https://issues.apache.org/jira/browse/PIG-794 aims to use Avro internally within Pig
> for efficiency, including intermediate serialization.
>
> https://issues.apache.org/jira/browse/AVRO-592 has the same goals that your patch does,
> but has fewer restrictions on what can and can't be written/read. It supports writing any
> Pig schema and reading it back in, but only reading a subset of Avro schemas (non-recursive;
> I may add unions later). With a little more work it could support intermediate serialization
> for Pig as well. Longer term goals include being able to use AvroStorage along with a Hive
> AvroSerDe on the same data, supporting projection, and supporting partitioning.
>
> I've been hoping to finish up AVRO-592 but am currently busy with other things.
>
> -Scott
>
> On Nov 30, 2010, at 9:05 PM, Lin Guo wrote:
>
>> Hi,
>>
>> We'd like to patch our Pig AvroStorage function and
>> would highly appreciate comments of any kind.
>>
>> doc:
>> http://snaprojects.jira.com/wiki/display/HTOOLS/AvroStorage+-+Pig+support+for+Avro+data
>>
>> jira:
>> https://issues.apache.org/jira/browse/PIG-1748
>>
>> Many thanks,
>> Lin
>
>
