pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Joseph Adler" <jad...@linkedin.com>
Subject Re: Review Request: PIG-3015 Rewrite of AvroStorage
Date Mon, 20 May 2013 16:38:01 GMT


> On March 19, 2013, 4:40 p.m., Jonathan Coveney wrote:
> > src/org/apache/pig/builtin/AvroStorage.java, line 352
> > <https://reviews.apache.org/r/8104/diff/4/?file=244837#file244837line352>
> >
> >     I realize using Long's compareTo is convenient, but this seems like unnecessary
boxing. why not just compare them directly? I realize this isn't performance critical cord,
it just stuck out to me, since you could just do a > instead...

For sorting, you need to implement compare (which tests for <, ==, and >). I switched
to com.google.common.primitives.Longs.compare


> On March 19, 2013, 4:40 p.m., Jonathan Coveney wrote:
> > src/org/apache/pig/impl/util/avro/AvroTupleWrapper.java, line 66
> > <https://reviews.apache.org/r/8104/diff/4/?file=244846#file244846line66>
> >
> >     May want to throw an UnsupportedOperationException instead, as if this is being
called, it's a more fundamental issue with Pig, separate from write related issues.

Stuck with the exceptions in the existing Tuple interface... but yes, that would be more logical


> On March 19, 2013, 4:40 p.m., Jonathan Coveney wrote:
> > src/org/apache/pig/impl/util/avro/AvroTupleWrapper.java, line 84
> > <https://reviews.apache.org/r/8104/diff/4/?file=244846#file244846line84>
> >
> >     shouldn't this throw an error? Or is avroObject.put() doing something I don't
expect, perhaps being 1-indexed instead of 0-indexed?

I think that write is never called; in the current version it just throws an error


- Joseph


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8104/#review18077
-----------------------------------------------------------


On Jan. 4, 2013, 7:22 p.m., Joseph Adler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8104/
> -----------------------------------------------------------
> 
> (Updated Jan. 4, 2013, 7:22 p.m.)
> 
> 
> Review request for pig and Cheolsoo Park.
> 
> 
> Description
> -------
> 
> The current AvroStorage implementation has a lot of issues: it requires old versions
of Avro, it copies data much more than needed, and it's verbose and complicated. (One pet
peeve of mine is that old versions of Avro don't support Snappy compression.)
> 
> I rewrote AvroStorage from scratch to fix these issues. In early tests, the new implementation
is significantly faster, and the code is a lot simpler. Rewriting AvroStorage also enabled
me to implement support for Trevni.
> 
> This is the latest version of the patch, complete with test cases and TrevniStorage.
(Test cases for TrevniStorage are still missing).
> 
> 
> This addresses bug PIG-3015.
>     https://issues.apache.org/jira/browse/PIG-3015
> 
> 
> Diffs
> -----
> 
>   .eclipse.templates/.classpath c7b83b8 
>   ivy.xml 70e8d50 
>   ivy/libraries.properties 7b07c7e 
>   src/org/apache/pig/builtin/AvroStorage.java PRE-CREATION 
>   src/org/apache/pig/builtin/TrevniStorage.java PRE-CREATION 
>   src/org/apache/pig/impl/util/avro/AvroArrayReader.java PRE-CREATION 
>   src/org/apache/pig/impl/util/avro/AvroBagWrapper.java PRE-CREATION 
>   src/org/apache/pig/impl/util/avro/AvroMapWrapper.java PRE-CREATION 
>   src/org/apache/pig/impl/util/avro/AvroRecordReader.java PRE-CREATION 
>   src/org/apache/pig/impl/util/avro/AvroRecordWriter.java PRE-CREATION 
>   src/org/apache/pig/impl/util/avro/AvroStorageDataConversionUtilities.java PRE-CREATION

>   src/org/apache/pig/impl/util/avro/AvroStorageSchemaConversionUtilities.java PRE-CREATION

>   src/org/apache/pig/impl/util/avro/AvroTupleWrapper.java PRE-CREATION 
>   test/commit-tests 5081fbc 
>   test/org/apache/pig/builtin/TestAvroStorage.java PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/directory_test.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/identity.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/identity_ai1_ao2.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/identity_ao2.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/identity_blank_first_args.pig PRE-CREATION

>   test/org/apache/pig/builtin/avro/code/pig/identity_codec.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/identity_just_ao2.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/namesWithDoubleColons.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/recursive_tests.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/trevni_to_avro.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/code/pig/trevni_to_trevni.pig PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/arrays.json PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/arraysAsOutputByPig.json PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordWithRepeatedSubRecords.json PRE-CREATION

>   test/org/apache/pig/builtin/avro/data/json/records.json PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsAsOutputByPig.json PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsOfArrays.json PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsOfArraysOfRecords.json PRE-CREATION

>   test/org/apache/pig/builtin/avro/data/json/recordsSubSchema.json PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsSubSchemaNullable.json PRE-CREATION

>   test/org/apache/pig/builtin/avro/data/json/recordsWithDoubleUnderscores.json PRE-CREATION

>   test/org/apache/pig/builtin/avro/data/json/recordsWithEnums.json PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsWithFixed.json PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsWithMaps.json PRE-CREATION 
>   test/org/apache/pig/builtin/avro/data/json/recordsWithMapsOfRecords.json PRE-CREATION

>   test/org/apache/pig/builtin/avro/data/json/recordsWithNullableUnions.json PRE-CREATION

>   test/org/apache/pig/builtin/avro/data/json/recursiveRecord.json PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/arrays.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/arraysAsOutputByPig.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordWithRepeatedSubRecords.avsc PRE-CREATION

>   test/org/apache/pig/builtin/avro/schema/records.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsAsOutputByPig.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsOfArrays.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsOfArraysOfRecords.avsc PRE-CREATION

>   test/org/apache/pig/builtin/avro/schema/recordsSubSchema.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsSubSchemaNullable.avsc PRE-CREATION

>   test/org/apache/pig/builtin/avro/schema/recordsWithDoubleUnderscores.avsc PRE-CREATION

>   test/org/apache/pig/builtin/avro/schema/recordsWithEnums.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsWithFixed.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsWithMaps.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/recordsWithMapsOfRecords.avsc PRE-CREATION

>   test/org/apache/pig/builtin/avro/schema/recordsWithNullableUnions.avsc PRE-CREATION

>   test/org/apache/pig/builtin/avro/schema/recursiveRecord.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/simpleRecordsTrevni.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/testDirectory.avsc PRE-CREATION 
>   test/org/apache/pig/builtin/avro/schema/testDirectoryCounts.avsc PRE-CREATION 
>   test/unit-tests 7cede06 
> 
> Diff: https://reviews.apache.org/r/8104/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Joseph Adler
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message