hive-dev mailing list archives

From "Remus Rusanu" <rem...@microsoft.com>
Subject Review Request 13059: HIVE-4850 Implement vector mode map join
Date Tue, 30 Jul 2013 11:11:18 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13059/
-----------------------------------------------------------

Review request for hive, Eric Hanson and Jitendra Pandey.


Bugs: HIVE-4850
    https://issues.apache.org/jira/browse/HIVE-4850


Repository: hive-git


Description
-------

This is not the final iteration, but I thought it would be easier to discuss it with a review.
This implementation works and handles multiple aliases and multiple values per key. It
uses the existing hash tables saved by the local task for the map join, which are row-mode
hash tables (they have row-mode keys and store row-mode writable object values). Going forward
we should avoid the size-of-big-table conversions of big-table keys to row mode and the
conversion of small-table values to vector data. This would require either converting the
hash tables to vector-friendly ones on the fly (when loaded) or changing the local task
hash table sink to create a vectorization-friendly hash table. The first approach may have
memory consumption problems (potentially two hash tables end up in memory, so we would have
to stream the transformation or transform while reading from the serialized format... nasty).
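
To make the conversion cost concrete, here is a minimal sketch of probing a row-mode hash
table from a vectorized key column. It is not the patch code: RowModeProbeSketch,
buildSmallTable and probe are hypothetical names, and a plain java.util.HashMap stands in
for the hash table saved by the local task. Each big-table key is boxed into a row-mode key
before the lookup, and each matching small-table value is copied back out of its row-mode
object, which are the per-row conversions described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RowModeProbeSketch {

    /** Hypothetical stand-in for a row-mode hash table: boxed key -> list of boxed value rows. */
    static Map<List<Object>, List<Object[]>> buildSmallTable() {
        Map<List<Object>, List<Object[]>> table = new HashMap<>();
        // Key 1 has two values, to show "multiple values per key".
        table.computeIfAbsent(Arrays.<Object>asList(1L), k -> new ArrayList<>()).add(new Object[] {"a"});
        table.computeIfAbsent(Arrays.<Object>asList(1L), k -> new ArrayList<>()).add(new Object[] {"b"});
        table.computeIfAbsent(Arrays.<Object>asList(3L), k -> new ArrayList<>()).add(new Object[] {"c"});
        return table;
    }

    /** Probes the table with a primitive key column and returns matches as "key:value" strings. */
    static List<String> probe(long[] keyColumn, int size, Map<List<Object>, List<Object[]>> table) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            // Conversion 1: the primitive vector entry is boxed into a row-mode key, once per big-table row.
            List<Object> rowKey = Arrays.<Object>asList(keyColumn[i]);
            List<Object[]> matches = table.get(rowKey);
            if (matches == null) {
                continue; // inner-join semantics: rows without a match are dropped
            }
            for (Object[] valueRow : matches) {
                // Conversion 2: the row-mode value is unpacked into the output, once per match.
                out.add(keyColumn[i] + ":" + valueRow[0]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long[] keys = {1L, 2L, 3L};
        System.out.println(probe(keys, keys.length, buildSmallTable())); // prints [1:a, 1:b, 3:c]
    }
}
```

A vector-friendly hash table would instead be keyed on the primitive values directly, so the
probe loop would not need to allocate a key object for every row.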


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java 82d4b93 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 31dbf41 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 4da1be8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 29de38d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java e579c00 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinDoubleKeys.java d774226 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectKey.java 791bb3f 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java 58a9dc0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinSingleKey.java 4bff936 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.java 8b4c615 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssign.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssignFactory.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExecMapper.java 083b9b9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapOperator.java 41d2001 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 9c90230 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatch.java ff13f89 
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java 9e189c9
  ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableDummyDesc.java f15ce48 

Diff: https://reviews.apache.org/r/13059/diff/


Testing
-------

Manually ran some join queries on the alltypes_orc table.


Thanks,

Remus Rusanu

