hive-dev mailing list archives

From "Russell Melick (JIRA)" <>
Subject [jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage
Date Fri, 20 May 2011 05:39:47 GMT


Russell Melick commented on HIVE-2036:

To expand a bit on Marquis' comments.

In CompactIndexHandler.getIndexPredicateAnalyzer(), we instantiate a predicate analyzer. 
My theory is that you're going to want a whole new PredicateAnalyzer class to deal with bitmaps,
and then you'll instantiate it in a very similar way inside BitmapIndexHandler.  You can also
see here how we only search for columns on which we have indexes.  This is going to need to
be modified, since it currently only allows columns from a single index.

You may also want to rewrite some of the logic in IndexWhereProcessor.process():110.  It currently
loops through every index available and asks it to do a rewrite.  Perhaps it should loop through
every index type and try to find the rewrites possible only using indexes of that type.
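
A rough sketch of the "loop through every index type" idea, assuming nothing about Hive's real classes: IndexInfo and IndexGrouping below are hypothetical stand-ins, and the handlerType strings are made up for illustration. The point is only that the indexes get bucketed by type first, so each type can then be asked for a rewrite using all of its own indexes together.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index definition; this is not Hive's Index class.
final class IndexInfo {
    final String name;
    final String handlerType; // illustrative values: "compact", "bitmap"
    final String column;

    IndexInfo(String name, String handlerType, String column) {
        this.name = name;
        this.handlerType = handlerType;
        this.column = column;
    }
}

final class IndexGrouping {
    // Buckets indexes by handler type, keeping the order in which types first appear.
    static Map<String, List<IndexInfo>> groupByType(List<IndexInfo> indexes) {
        Map<String, List<IndexInfo>> byType = new LinkedHashMap<>();
        for (IndexInfo index : indexes) {
            byType.computeIfAbsent(index.handlerType, k -> new ArrayList<>()).add(index);
        }
        return byType;
    }
}
```

Each entry of the returned map could then be handed to whatever does the rewrite for that type, rather than asking one index at a time.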

If you look at IndexPredicateAnalyzer:123, you can see where it's making sure that all the
parent operators are AND operations.  It should be easy to modify this to allow OR operations,
but I'm not sure that simply allowing them and using the current system will maintain logical
correctness.  It's probably better to start off with just AND's.
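
To illustrate the AND-only restriction, here is a minimal sketch on a toy expression tree. Node and AndOnlyCheck are hypothetical and do not mirror Hive's expression classes. A leaf is collected only if every operator between it and the root is an AND; anything under an OR is skipped whole. That is the conservative choice: for "a = 1 OR b = 2", pushing only "a = 1" to an index would wrongly drop rows where b = 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical expression node; Hive's real expression classes are not used here.
final class Node {
    final String label;        // "AND", "OR", or a leaf such as "a = 1"
    final List<Node> children;

    Node(String label, Node... children) {
        this.label = label;
        this.children = Arrays.asList(children);
    }

    boolean isLeaf() {
        return children.isEmpty();
    }
}

final class AndOnlyCheck {
    // Returns the leaves reachable from the root through AND nodes only.
    static List<String> pushableLeaves(Node root) {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node node, List<String> out) {
        if (node.isLeaf()) {
            out.add(node.label);
        } else if (node.label.equals("AND")) {
            for (Node child : node.children) {
                collect(child, out);
            }
        }
        // Any other operator (for example OR): skip the whole subtree.
    }
}
```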

The pushedPredicate is the important thing returned by the predicate analyzer.  The pushed
predicate is what it was able to recognize/process.  That's the tree you'll want to use to
generate the bitmap query.  The residual predicate is what it couldn't process. There's a
separate JIRA open (HIVE-2115) to use the residual to cut down on remaining work.
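
The pushed/residual split can be pictured with a small sketch. All names here (Comparison, Decomposed, PredicateSplitter) are hypothetical; the real analyzer works on Hive expression trees rather than a flat list. It assumes the predicate has already been flattened into AND-ed comparisons, and it sends each one to "pushed" if its column has an index and to "residual" otherwise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical flat comparison such as "key = 5"; not a Hive class.
final class Comparison {
    final String column;
    final String op;
    final String literal;

    Comparison(String column, String op, String literal) {
        this.column = column;
        this.op = op;
        this.literal = literal;
    }

    String render() {
        return column + " " + op + " " + literal;
    }
}

// Hypothetical result holder: what the index can handle versus what is left over.
final class Decomposed {
    final List<Comparison> pushed = new ArrayList<>();
    final List<Comparison> residual = new ArrayList<>();
}

final class PredicateSplitter {
    // Splits AND-ed comparisons by whether their column is covered by some index.
    static Decomposed split(List<Comparison> conjuncts, Set<String> indexedColumns) {
        Decomposed result = new Decomposed();
        for (Comparison c : conjuncts) {
            if (indexedColumns.contains(c.column)) {
                result.pushed.add(c);
            } else {
                result.residual.add(c);
            }
        }
        return result;
    }
}
```

For a bitmap analyzer, indexedColumns would presumably be the union of columns across all bitmap indexes on the table, not the columns of a single index.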

The query generation lives in the IndexHandlers.generateIndexQuery(...).  You'll definitely
need more logic than the simple call to decomposedPredicate.pushedPredicate.getExprString()
that is in the CompactIndexHandler.
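
As a loose sketch of why a single getExprString() call is not enough once several indexes are involved: the pushed terms first have to be grouped by the index that covers them, giving one query fragment per index. PushedTerm, IndexQuerySketch, the index table names and the SELECT shape below are all assumptions for illustration, not what the handlers actually emit, and the sketch does not attempt the step of combining the bitmaps from the separate fragments.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pairing of a rendered condition with the index table that covers it.
final class PushedTerm {
    final String indexTable; // made-up index table name
    final String condition;  // already rendered, e.g. "a = 1"

    PushedTerm(String indexTable, String condition) {
        this.indexTable = indexTable;
        this.condition = condition;
    }
}

final class IndexQuerySketch {
    // Emits one illustrative query string per index table, AND-ing that table's conditions.
    static List<String> perIndexQueries(List<PushedTerm> terms) {
        Map<String, List<String>> byTable = new LinkedHashMap<>();
        for (PushedTerm term : terms) {
            byTable.computeIfAbsent(term.indexTable, k -> new ArrayList<>()).add(term.condition);
        }
        List<String> queries = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : byTable.entrySet()) {
            // Illustrative query shape only; a real index query selects specific columns.
            queries.add("SELECT * FROM " + entry.getKey()
                + " WHERE " + String.join(" AND ", entry.getValue()));
        }
        return queries;
    }
}
```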

There are a few spots where hive.index.compact.file is used.  These may need to be generalized.
However, Marquis may have already taken care of this with the bitmap stuff.  I don't remember
what the new name for it was (I think it's hive.index.blockfilter.file), but it's probably
easiest to look in one of his unit tests for it.

The last thing I can think of is that having multiple index types on a single table, or queries
that use multiple tables may become an issue.  I created HIVE-2128 to deal with the multiple

Good luck!

> Update bitmap indexes for automatic usage
> -----------------------------------------
>                 Key: HIVE-2036
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.8.0
>            Reporter: Russell Melick
>            Assignee: Jeffrey Lym
> HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support.
> The bitmap code will need to be extended after it is committed to enable automatic use of
> indexing.  Most work will be focused in the BitmapIndexHandler, which needs to generate the
> re-entrant QL index query.  There may also be significant work in the IndexPredicateAnalyzer
> to support predicates with OR's, instead of just AND's as it is currently.
