hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9203) Secondary index support through coprocessors
Date Tue, 20 Aug 2013 12:39:54 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13744928#comment-13744928 ]

Anoop Sam John commented on HBASE-9203:

bq.Since the midpoint for index table region may not be chosen for the split, it is possible
that the daughter regions of index region may have (quite) different amount of data. How can
we mitigate this effect ?

I think this won't happen. The daughter regions of the index region will have similar size
proportions to those of the actual table region. Suppose an actual table region has 10 entries
and is now split as 6,4. Consider there is one index on the data. The index region before
the split will contain 10 entries, and after the split the daughters will have 6 and 4 entries
respectively. The only difference will be the way the half-file reading happens. In the case
of a normal table there is a clear split point w.r.t. the row key (RK), and the readers can
read up to the split point / read from the split point. But for the index region, both daughter
region readers need to start from the beginning and check whether each entry belongs to them
or not as they traverse. After a split, compaction will happen using the HalfFileReader and
split the file into 2 physical files. So the reader overhead is only temporary.
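
To make the 6,4 example concrete, here is a rough sketch in Python. It is not HBase code: the entry layout (indexed value, actual row key) and the names build_index and daughter_index_reader are assumptions made up for illustration. It only shows why each daughter reader has to scan the parent index file from the start and filter, and why the daughters still end up with 6 and 4 entries.

```python
from typing import Iterator, List, Tuple

# Hypothetical index entry: (indexed column value, row key of the data row).
IndexEntry = Tuple[str, str]


def build_index(rows: List[Tuple[str, str]]) -> List[IndexEntry]:
    """Order entries by indexed value, the way an index file would be sorted."""
    return sorted((value, row_key) for row_key, value in rows)


def daughter_index_reader(
    index_file: List[IndexEntry], split_key: str, top: bool
) -> Iterator[IndexEntry]:
    """Scan the whole parent index file and keep only this daughter's entries.

    There is no single split point in index order, so the reader starts at
    the beginning and tests every entry against the data table's split key.
    """
    for value, row_key in index_file:
        belongs_to_top = row_key >= split_key
        if belongs_to_top == top:
            yield (value, row_key)


# 10 data rows; indexed values are deliberately not in row key order.
rows = [(f"row{i:02d}", f"v{(i * 7) % 10}") for i in range(10)]
index_file = build_index(rows)

# The data region splits at row06, giving 6 rows below and 4 rows above.
split_key = "row06"
bottom = list(daughter_index_reader(index_file, split_key, top=False))
top = list(daughter_index_reader(index_file, split_key, top=True))

print(len(bottom), len(top))  # 6 4
```

Once compaction rewrites each daughter's entries into its own file, this filtering pass is no longer needed.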
> Secondary index support through coprocessors
> --------------------------------------------
>                 Key: HBASE-9203
>                 URL: https://issues.apache.org/jira/browse/HBASE-9203
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 0.98.0
>            Reporter: rajeshbabu
>            Assignee: rajeshbabu
>         Attachments: SecondaryIndex Design.pdf
> We have been working on implementing secondary index in HBase and have open sourced it on the HBase 0.94.8 version.
> The project is available on github.
> https://github.com/Huawei-Hadoop/hindex
> This Jira is to support secondary index on trunk(0.98).
> Following features will be supported.
> -          multiple indexes on table,
> -          multi column index,
> -          index based on part of a column value,
> -          equals and range condition scans using index, and
> -          bulk loading data to indexed table (Indexing done with bulk load)
> Most of the kernel changes needed for secondary index are available in trunk. Very minimal changes are needed for it.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
