hive-dev mailing list archives

From "Russell Melick (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HIVE-1644) use filter pushdown for automatically accessing indexes
Date Wed, 16 Feb 2011 01:29:57 GMT

     [ https://issues.apache.org/jira/browse/HIVE-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Russell Melick updated HIVE-1644:
---------------------------------

    Attachment: HIVE-1644.1.patch

I'm having trouble joining the operator trees together.  Within
org.apache.hadoop.hive.ql.optimizer.index.IndexWhereProcessor,
I'm working within rewriteForIndex(...).  I added a unit test called
index_opt_where.q, which is what I'll be using for examples.

After line 139, I have two parseContexts.  One is the original
parseContext of the normal query

(SELECT * FROM src WHERE key=86 ORDER BY key), 

and the other is the parseContext of the reentrant query used to
generate the temporary table.

(INSERT OVERWRITE DIRECTORY "/tmp/index_result_where1" SELECT
`_bucketname` ,  `_offsets` FROM default__src_src_index__ WHERE key=86)

I don't really know how to join the operator trees from these
parseContexts together.  I tried just setting the topOps of the original
query to include the topOps of the reentrant query.  I ran into null
pointer exceptions when it went to optimize using other Transforms.

I then tried adding the topOps of the original query as children of the
bottommost child node of the reentrant query, and then setting that
combined operator tree as the topOp of the original parseContext.  That
also gave me trouble.  For some reason, the
IndexWhereProcessor.process(...) method is called twice, and the second
time, it tried to use the reentrant topOp table name
(default__src_src_index...) to look up the tblScan operator, but the new
topOp only had src as a TableScan.  The new topOp was the reentrant
query, and that doesn't scan the default__src... table, it creates it.
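
To make the second attempt easier to picture, here is a toy sketch in
plain Java.  It does not use any Hive classes: Op, Ctx, chain(...) and
the operator labels are hypothetical stand-ins, and registering the
combined tree under the alias "src" is only one possible reading of the
description above.  It shows how a lookup keyed by the index table name
can come back empty once the trees are chained.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OperatorTreeSketch {

    /** Hypothetical stand-in for an operator: a label plus child links. */
    static class Op {
        final String name;
        final List<Op> children = new ArrayList<>();

        Op(String name) { this.name = name; }

        /** Adds a child and returns it, so a linear chain can be built fluently. */
        Op then(Op child) { children.add(child); return child; }
    }

    /** Hypothetical stand-in for a parse context: table alias -> root operator. */
    static class Ctx {
        final Map<String, Op> topOps = new LinkedHashMap<>();
    }

    /** Follows the first child at each level down to the bottommost operator. */
    static Op bottom(Op op) {
        while (!op.children.isEmpty()) {
            op = op.children.get(0);
        }
        return op;
    }

    /** Toy tree for: SELECT * FROM src WHERE key=86 ORDER BY key */
    static Ctx originalQuery() {
        Ctx ctx = new Ctx();
        Op scan = new Op("TS[src]");
        scan.then(new Op("FIL[key=86]")).then(new Op("SEL")).then(new Op("FS"));
        ctx.topOps.put("src", scan);
        return ctx;
    }

    /** Toy tree for the reentrant INSERT OVERWRITE DIRECTORY query. */
    static Ctx reentrantQuery() {
        Ctx ctx = new Ctx();
        Op scan = new Op("TS[default__src_src_index__]");
        scan.then(new Op("FIL[key=86]"))
            .then(new Op("SEL[_bucketname,_offsets]"))
            .then(new Op("FS[/tmp/index_result_where1]"));
        ctx.topOps.put("default__src_src_index__", scan);
        return ctx;
    }

    /**
     * Hangs the original roots under the bottom of the reentrant tree, then
     * registers the combined tree as the topOp of the original context.
     * Assumption: the combined tree is stored under the original alias "src".
     */
    static void chain(Ctx original, Ctx reentrant) {
        Op reentrantRoot = reentrant.topOps.values().iterator().next();
        bottom(reentrantRoot).children.addAll(original.topOps.values());
        original.topOps.put("src", reentrantRoot);
    }

    public static void main(String[] args) {
        Ctx original = originalQuery();
        chain(original, reentrantQuery());

        // The alias "src" now points at a scan of the index table.
        System.out.println("src -> " + original.topOps.get("src").name);

        // A later pass that looks up the scan by the index table name finds nothing.
        System.out.println("index alias -> " + original.topOps.get("default__src_src_index__"));
    }
}
```

In this toy model, the alias and the root operator disagree after
chaining, which is the kind of mismatch a second call to process(...)
could trip over.  Whether that is what actually happens in the patch is
not confirmed here.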

I've attached a patch, in hopes that it will be reasonably easy to see
what I'm talking about.  There are a lot of hacks in it (it always uses
indexOptimizing as an Optimization, and the query reconstruction is
bad), but it's stuff I didn't want to worry about until I could get it
working.

> use filter pushdown for automatically accessing indexes
> -------------------------------------------------------
>
>                 Key: HIVE-1644
>                 URL: https://issues.apache.org/jira/browse/HIVE-1644
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.7.0
>            Reporter: John Sichi
>            Assignee: Russell Melick
>         Attachments: HIVE-1644.1.patch
>
>
> HIVE-1226 provides utilities for analyzing filters which have been pushed down to a table
> scan.  The next step is to use these for selecting available indexes and generating access
> plans for those indexes.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
