pig-dev mailing list archives

From "Jonathan Coveney" <jcove...@gmail.com>
Subject Re: Review Request: SchemaTuple in Pig
Date Fri, 29 Jun 2012 21:55:09 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4651/
-----------------------------------------------------------

(Updated June 29, 2012, 9:55 p.m.)


Review request for pig and Julien Le Dem.


Description
-------

This work builds on Dmitriy's PrimitiveTuple work. The idea is that, knowing the Schema on
the frontend, we can code-generate Tuples specialized to that Schema, which can be used for
fun and profit. In rudimentary tests, memory efficiency is 2-4x better and the serialized
form is ~15% smaller (though this depends heavily on the data). I still need to run get/set
tests, but assuming get/set performance is on par with (or even faster than) Tuple, the
memory gain is huge.
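
To illustrate why a schema-specific class can be more memory efficient, here is a minimal
sketch. It is not code from the patch: the names GenericTuple and IdScoreNameTuple, the
example schema (id:long, score:double, name:chararray), and the null bitmask layout are all
assumptions made for illustration. A generic tuple boxes every field into an Object held in
a list, whereas a generated class can keep primitives in plain fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; these are not the classes from the patch.
public class SchemaTupleSketch {

    /** Generic tuple: each field is a boxed Object stored in a list. */
    static final class GenericTuple {
        private final List<Object> fields = new ArrayList<>();

        void append(Object value) { fields.add(value); }
        Object get(int i) { return fields.get(i); }
        int size() { return fields.size(); }
    }

    /**
     * Hand-written stand-in for a class that might be generated for the
     * assumed schema (id:long, score:double, name:chararray).
     */
    static final class IdScoreNameTuple {
        private long id;
        private double score;
        private String name;
        // Bit i set means field i is null; every field starts out null.
        private byte nullMask = 0b111;

        int size() { return 3; }

        boolean isNull(int i) {
            checkIndex(i);
            return (nullMask & (1 << i)) != 0;
        }

        // Typed setter: stores the primitive directly, with no boxing.
        void setLong(int i, long value) {
            if (i != 0) throw new IllegalArgumentException("field " + i + " is not a long");
            id = value;
            nullMask &= ~(1 << i);
        }

        // Typed getter: returns the primitive, failing loudly on a null field.
        long getLong(int i) {
            if (i != 0) throw new IllegalArgumentException("field " + i + " is not a long");
            if (isNull(i)) throw new IllegalStateException("field " + i + " is null");
            return id;
        }

        // Generic setter, for callers that only know the Object-based contract.
        void set(int i, Object value) {
            checkIndex(i);
            if (value == null) {
                nullMask |= (1 << i);
                if (i == 2) name = null;
                return;
            }
            switch (i) {
                case 0: id = (Long) value; break;
                case 1: score = (Double) value; break;
                default: name = (String) value; break;
            }
            nullMask &= ~(1 << i);
        }

        // Generic getter: boxes on the way out, or returns null for a null field.
        Object get(int i) {
            if (isNull(i)) return null;
            switch (i) {
                case 0: return id;
                case 1: return score;
                default: return name;
            }
        }

        private void checkIndex(int i) {
            if (i < 0 || i >= 3) throw new IndexOutOfBoundsException("no field " + i);
        }
    }

    public static void main(String[] args) {
        GenericTuple generic = new GenericTuple();
        generic.append(42L);      // boxed into a Long
        generic.append(0.5);      // boxed into a Double
        generic.append("pig");

        IdScoreNameTuple typed = new IdScoreNameTuple();
        typed.setLong(0, 42L);    // stored as a primitive long
        typed.set(1, 0.5);
        typed.set(2, "pig");

        System.out.println(generic.get(0) + " " + typed.getLong(0));
    }
}
```

In this sketch the savings come from dropping the per-field wrapper objects and the backing
list; how close that gets to the 2-4x figure above depends on the schema and the data.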

Need to clean up the code and add tests.

Right now, it generates a SchemaTuple for every inputSchema and outputSchema given to UDFs.
The next step is to make a SchemaBag, where I think the serialization savings will be really
huge.

Needs tests and comments, but I want the code to settle a bit.


This addresses bug PIG-2632.
    https://issues.apache.org/jira/browse/PIG-2632


Diffs (updated)
-----

  trunk/.gitignore 1355561 
  trunk/conf/pig.properties 1355561 
  trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 1355561 
  trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapBase.java 1355561 
  trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java 1355561 
  trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigTupleDefaultRawComparator.java 1355561 
  trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java 1355561 
  trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1355561 
  trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POFRJoin.java 1355561 
  trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMergeJoin.java 1355561 
  trunk/src/org/apache/pig/builtin/mock/Storage.java 1355561 
  trunk/src/org/apache/pig/data/AppendableSchemaTuple.java PRE-CREATION 
  trunk/src/org/apache/pig/data/BinInterSedes.java 1355561 
  trunk/src/org/apache/pig/data/BinSedesTupleFactory.java 1355561 
  trunk/src/org/apache/pig/data/DataByteArray.java 1355561 
  trunk/src/org/apache/pig/data/FieldIsNullException.java PRE-CREATION 
  trunk/src/org/apache/pig/data/PBooleanTuple.java 1355561 
  trunk/src/org/apache/pig/data/PDoubleTuple.java 1355561 
  trunk/src/org/apache/pig/data/PFloatTuple.java 1355561 
  trunk/src/org/apache/pig/data/PIntTuple.java 1355561 
  trunk/src/org/apache/pig/data/PLongTuple.java 1355561 
  trunk/src/org/apache/pig/data/PStringTuple.java 1355561 
  trunk/src/org/apache/pig/data/PrimitiveFieldTuple.java 1355561 
  trunk/src/org/apache/pig/data/PrimitiveTuple.java 1355561 
  trunk/src/org/apache/pig/data/SchemaTuple.java PRE-CREATION 
  trunk/src/org/apache/pig/data/SchemaTupleBackend.java PRE-CREATION 
  trunk/src/org/apache/pig/data/SchemaTupleClassGenerator.java PRE-CREATION 
  trunk/src/org/apache/pig/data/SchemaTupleFactory.java PRE-CREATION 
  trunk/src/org/apache/pig/data/SchemaTupleFrontend.java PRE-CREATION 
  trunk/src/org/apache/pig/data/TupleFactory.java 1355561 
  trunk/src/org/apache/pig/data/TupleMaker.java PRE-CREATION 
  trunk/src/org/apache/pig/data/TypeAwareTuple.java 1355561 
  trunk/src/org/apache/pig/data/utils/BytesHelper.java PRE-CREATION 
  trunk/src/org/apache/pig/data/utils/MethodHelper.java PRE-CREATION 
  trunk/src/org/apache/pig/data/utils/SedesHelper.java PRE-CREATION 
  trunk/src/org/apache/pig/data/utils/StructuresHelper.java PRE-CREATION 
  trunk/src/org/apache/pig/impl/PigContext.java 1355561 
  trunk/src/org/apache/pig/impl/io/InterRecordReader.java 1355561 
  trunk/src/org/apache/pig/impl/io/NullableTuple.java 1355561 
  trunk/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java 1355561 
  trunk/src/org/apache/pig/newplan/logical/expression/UserFuncExpression.java 1355561 
  trunk/src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 1355561 
  trunk/src/org/apache/pig/newplan/logical/relational/LogicalRelationalOperator.java 1355561 
  trunk/src/org/apache/pig/newplan/logical/rules/GroupByConstParallelSetter.java 1355561 
  trunk/src/org/apache/pig/newplan/logical/rules/MergeForEach.java 1355561 
  trunk/test/org/apache/pig/data/TestSchemaTuple.java PRE-CREATION 
  trunk/test/org/apache/pig/data/utils/TestMethodHelper.java PRE-CREATION 
  trunk/test/org/apache/pig/test/TestDataBag.java 1355561 
  trunk/test/org/apache/pig/test/TestLogicalPlanBuilder.java 1355561 
  trunk/test/org/apache/pig/test/TestPrimitiveFieldTuple.java 1355561 
  trunk/test/org/apache/pig/test/TestPrimitiveTuple.java 1355561 
  trunk/test/org/apache/pig/test/TestSchema.java 1355561 

Diff: https://reviews.apache.org/r/4651/diff/


Testing
-------


Thanks,

Jonathan Coveney

