hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "John Sichi" <jsi...@fb.com>
Subject Re: Review Request: HIVE-1644 Use filter pushdown for automatically accessing indexes
Date Fri, 15 Apr 2011 20:34:07 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/558/#review482
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
<https://reviews.apache.org/r/558/#comment982>

    A few comments here.
    
    1) Rather than passing in the entire table scan object and letting the handler set properties
on it, I think we should just have the handler pass back the necessary information (input
format and intermediate file).
    
    2) The generateIndexQuery method's parameter list is growing.  For plugin interfaces,
a good pattern we've been using in other places is to introduce a new context class (say HiveIndexQueryContext)
with getters and setters for the information to be communicated in both directions.  Then
the caller instantiates one of these and passes in an instance.  The plugin reads and writes
to the context.  On return, the caller gets the modified information out.  The main benefit
is that in the future, if we need to pass more information, we just add new members to the
context class, and none of the existing plugin implementations break.
    
    In this case, you could also put the context objects in a map (instead of having to keep
multiple maps indexQueryTasks/additionalInputs etc).
    



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
<https://reviews.apache.org/r/558/#comment983>

    Just put it as a TODO for now; create the followup JIRA issue and reference it in the
TODO.
    



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
<https://reviews.apache.org/r/558/#comment990>

    Look in Hive.java; there are methods like
    
    public List<Partition> getPartitionsByNames(Table tbl, List<String> partNames)
    
    which look up the actual partitions for a table from the metastore.  You can pass in indexTable.
    



ql/src/test/results/clientpositive/index_opt_where_partitioned.q.out
<https://reviews.apache.org/r/558/#comment991>

    Hmm...what if we could avoid relabeling altogether?  If you look in Driver.java, there's
a method compile which calls TaskFactory.resetId().  This is what causes us to start back
over from 0.
    
    If you add an optional parameter resetTaskIds=true, and then pass false for the Driver
instance used for compiling the reentrant query, that might do it.
    
    
    


- John


On 2011-04-15 08:08:14, Russell Melick wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/558/
> -----------------------------------------------------------
> 
> (Updated 2011-04-15 08:08:14)
> 
> 
> Review request for hive.
> 
> 
> Summary
> -------
> 
> Review request for HIVE-1644.12.patch
> 
> 
> This addresses bug HIVE-1644.
>     https://issues.apache.org/jira/browse/HIVE-1644
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a21f589 
>   conf/hive-default.xml c42197f 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 6437385 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java c02d90b 
>   ql/src/java/org/apache/hadoop/hive/ql/index/AbstractIndexHandler.java dd0186d 
>   ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexHandler.java 411b78f 
>   ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 1f01446

>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 50db44c 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java 6162676 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/IndexWhereResolver.java PRE-CREATION

>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/PhysicalOptimizer.java 0ae9fa2

>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcCtx.java
PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 937a7b3 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java f0aca84 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 73391e9 
>   ql/src/test/queries/clientpositive/index_opt_where.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/index_opt_where_partitioned.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/index_opt_where_simple.q PRE-CREATION 
>   ql/src/test/results/clientpositive/index_opt_where.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/index_opt_where_partitioned.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/index_opt_where_simple.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/558/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Russell
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message