hbase-issues mailing list archives

From "Kannan Muthukkaruppan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6014) Support for block-granularity bitmap indexes
Date Tue, 05 Jun 2012 18:16:23 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289619#comment-13289619 ]

Kannan Muthukkaruppan commented on HBASE-6014:
----------------------------------------------

Can't think of one that'll immediately benefit from this. So this will be low-pri for us too
right now.
                
> Support for block-granularity bitmap indexes
> --------------------------------------------
>
>                 Key: HBASE-6014
>                 URL: https://issues.apache.org/jira/browse/HBASE-6014
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver
>            Reporter: Todd Lipcon
>         Attachments: 6014-bitmap-hacking.txt, bitmap-hacking.txt
>
>
> This came up in a discussion with Kannan today, so I promised to write something brief
> on JIRA -- this was suggested as a potential summer intern project. The idea is as follows:
>
> We have several customers who periodically run full table scan MR jobs against large
> HBase tables while applying fairly restrictive predicates. The predicates are often reasonably
> simple boolean expressions across known columns, and those columns often are enum-typed or
> otherwise have a fairly restricted range of values. For example, a real time process may mark
> rows as dirty, and a background MR job may scan for dirty rows in order to perform further
> processing like rebuilding inverted indexes.
>
> One way to speed up this type of query is to add bitmap indexes. In the context of HBase,
> I would envision this as a new type of metadata block included in the HFile which has a series
> of tuples: (qualifier, value range, compressed bitmap). A 1 bit in the bitmap indicates that
> the corresponding HFile block has at least one cell for which a column with the given qualifier
> falls within the given range. Queries which have an equality or comparison predicate against
> an indexed qualifier can then use the bitmap index to seek directly to those blocks which
> may contain relevant data.
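
To make the proposed structure concrete, here is a minimal Java sketch of a block-granularity bitmap index. It is only an illustration of the (qualifier, value range, bitmap) tuples described above, not code from the attached patches. Every name in it (BlockBitmapIndexSketch, Entry, addRange, recordCell, candidateBlocks) is hypothetical and none of them is an HBase API. It also assumes integer-valued columns and uses an uncompressed java.util.BitSet where the proposal calls for a compressed bitmap.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch of a per-HFile bitmap index; not an HBase API. */
final class BlockBitmapIndexSketch {

    /** One index tuple: (qualifier, value range, bitmap over block numbers). */
    static final class Entry {
        final String qualifier;
        final int lo; // inclusive lower bound of the value range
        final int hi; // inclusive upper bound of the value range
        // Bit i set => block i has at least one cell in [lo, hi].
        // Assumption: uncompressed here; the proposal suggests compression.
        final BitSet blocks = new BitSet();

        Entry(String qualifier, int lo, int hi) {
            this.qualifier = qualifier;
            this.lo = lo;
            this.hi = hi;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Declares a value range to index for the given qualifier. */
    void addRange(String qualifier, int lo, int hi) {
        entries.add(new Entry(qualifier, lo, hi));
    }

    /** Write path: note that the given block holds a cell with this value. */
    void recordCell(int blockIndex, String qualifier, int value) {
        for (Entry e : entries) {
            if (e.qualifier.equals(qualifier) && value >= e.lo && value <= e.hi) {
                e.blocks.set(blockIndex);
            }
        }
    }

    /**
     * Read path: blocks that may hold a cell whose value lies in [lo, hi].
     * The result can contain false positives but no false negatives, provided
     * the declared ranges cover every value written for the qualifier. A real
     * implementation would fall back to a full scan for unindexed qualifiers.
     */
    BitSet candidateBlocks(String qualifier, int lo, int hi) {
        BitSet result = new BitSet();
        for (Entry e : entries) {
            boolean overlaps = e.lo <= hi && lo <= e.hi;
            if (e.qualifier.equals(qualifier) && overlaps) {
                result.or(e.blocks);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        BlockBitmapIndexSketch index = new BlockBitmapIndexSketch();
        index.addRange("dirty", 0, 0); // clean rows
        index.addRange("dirty", 1, 1); // dirty rows
        index.recordCell(0, "dirty", 0);
        index.recordCell(1, "dirty", 0);
        index.recordCell(2, "dirty", 1);
        index.recordCell(3, "dirty", 0);
        index.recordCell(3, "dirty", 1);
        // A scan for dirty == 1 only needs to read blocks 2 and 3.
        System.out.println(index.candidateBlocks("dirty", 1, 1)); // prints {2, 3}
    }
}
```

In this sketch a scanner applying the predicate dirty == 1 would seek only to the blocks whose bits are set and skip blocks 0 and 1 entirely. The rows in the candidate blocks still have to be filtered, since a set bit only says that at least one matching cell is present.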

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
