hive-dev mailing list archives

From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-1803) Implement bitmap indexing in Hive
Date Wed, 23 Mar 2011 22:11:05 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010482#comment-13010482 ]

He Yongqiang commented on HIVE-1803:
------------------------------------

Had an offline discussion with Namit about this JIRA.

The basic question is how this bitmap index would be used. Given that there are millions of
rows in one block, the block will contain all the distinct values this column has, so the
bitmap index will not be very useful. One possible use case is a bitmap AND/OR. E.g., we need
to find all records about Male in Japan, where Male and Japan are both bitmap indexed. What we
can do today is first do a JOIN and a BITMAP AND operation on the two index tables, and then
find all the matching blocks. That is OK, but it requires a join operation. If we can support
a bitmap index with more than one index column, it will help in this case. I mean each index
column in the index table has its own bitmap, e.g. FILE_NAME, BLK_OFFSET, GENDER,
bitmapForGENDER, COUNTRY, bitmapForCountry. bitmapForGENDER will have two bitmaps internally,
one for Male and one for Female, and bitmapForCountry will have one bitmap for each country.
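
To make the multi-column proposal concrete, here is a minimal sketch in Python. It is only
an illustration, not Hive code: the in-memory layout, the helper names (build_bitmaps,
set_bits), the sample rows, and the file name are all assumptions. Each index entry keeps one
bitmap per distinct value of each indexed column, so the Male/Japan lookup becomes a single
bitwise AND inside one entry, with no join between index tables.

```python
# Hypothetical in-memory model of one index-table entry per (file, block).
# Each indexed column maps a distinct value to a bitmap (a Python int here);
# bit i set means row i of the block holds that value.

def build_bitmaps(rows, column):
    """Return {value: bitmap} for one column over the rows of one block."""
    bitmaps = {}
    for i, row in enumerate(rows):
        bitmaps[row[column]] = bitmaps.get(row[column], 0) | (1 << i)
    return bitmaps

def set_bits(bitmap):
    """List the positions of the set bits, lowest first."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Made-up sample rows standing in for one block of the base table.
block_rows = [
    {"gender": "Male", "country": "Japan"},
    {"gender": "Female", "country": "Japan"},
    {"gender": "Male", "country": "China"},
    {"gender": "Male", "country": "Japan"},
]

index_entry = {
    "file_name": "part-00000",  # hypothetical file name
    "blk_offset": 0,
    "gender": build_bitmaps(block_rows, "gender"),    # bitmapForGENDER
    "country": build_bitmaps(block_rows, "country"),  # bitmapForCountry
}

# Male AND Japan within the same index entry: no join needed.
matches = index_entry["gender"].get("Male", 0) & index_entry["country"].get("Japan", 0)
matching_rows = set_bits(matches)
```

A block whose AND result is zero can be pruned entirely; a non-zero result also tells us
which rows inside the block match.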

And if Hive can support skipping rows, the bitmap index will be very useful. I mean that with
bitmap indexing, block pruning alone may not be good enough. For example, in a block we may
find that only row1, row3, and lastRow satisfy the predicate. We can then just skip row2 and
row4 through lastRow-1.
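
As a rough sketch of the row-skipping idea (again in Python, and again hypothetical: skip_plan
and its tuple format are made up for illustration and do not correspond to any existing Hive
reader API), a row bitmap can be turned into alternating read/skip runs that a record reader
could follow:

```python
def skip_plan(bitmap, num_rows):
    """Turn a row bitmap into (action, first_row, row_count) runs.

    Bit i of `bitmap` set means row i satisfies the predicate.
    Rows are 0-based here, unlike the row1/row3 naming in the text.
    """
    plan = []
    for i in range(num_rows):
        action = "read" if (bitmap >> i) & 1 else "skip"
        if plan and plan[-1][0] == action:
            # Extend the current run by one row.
            plan[-1] = (action, plan[-1][1], plan[-1][2] + 1)
        else:
            # Start a new run at row i.
            plan.append((action, i, 1))
    return plan

# A block of 8 rows where only the first, third, and last rows match.
plan = skip_plan(0b10000101, 8)
```

For this example the plan reads one row, skips one, reads one, skips four, and reads the last
row, which is the row2 / row4-to-lastRow-1 skipping described above.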


What do you think?

> Implement bitmap indexing in Hive
> ---------------------------------
>
>                 Key: HIVE-1803
>                 URL: https://issues.apache.org/jira/browse/HIVE-1803
>             Project: Hive
>          Issue Type: New Feature
>          Components: Indexing
>            Reporter: Marquis Wang
>            Assignee: Marquis Wang
>         Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch,
HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, JavaEWAH_20110304.zip, bitmap_index_1.png,
bitmap_index_2.png, javaewah.jar, javaewah.jar
>
>
> Implement bitmap index handler to complement compact indexing.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
