hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-605) allow scanners which return results ordred by a column value
Date Mon, 19 May 2008 19:39:55 GMT

    [ https://issues.apache.org/jira/browse/HBASE-605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598057#action_12598057 ]

stack commented on HBASE-605:

This patch has much merit, if only because it verifies (after making a few tweaks)
that a subclass of HRegionServer is possible.

+ Source lines are < 80 columns wide in Hadoop.
+ Should you be subclassing HColumnDescriptor too?  Should it be versioned too?
+ Should we instead add accessors to HRS for the data members you changed from private to
protected (leases and requestCount)?
+ Do we need to make HRegion subclassable, or at least configurable as to which HStore it uses?
HTable too. (As is, they are 'polluted' with your sorted column code.)
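
To illustrate the accessor question above, here is a minimal sketch in Java. BaseServer and SortedColumnServer are hypothetical stand-ins, not the real HRegionServer or anything in hbase-605.patch, and the AtomicInteger type for requestCount is an assumption made for the example. The field stays private and the subclass reaches it through a protected getter.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for HRegionServer; the field type is an assumption.
class BaseServer {
    // Stays private: subclasses cannot reassign or shadow it.
    private final AtomicInteger requestCount = new AtomicInteger();

    // Protected accessor: the only way a subclass reaches the counter.
    protected AtomicInteger getRequestCount() {
        return requestCount;
    }
}

// Hypothetical subclass, standing in for a sorted-column region server.
class SortedColumnServer extends BaseServer {
    // Counts one request and returns the running total.
    int handleRequest() {
        return getRequestCount().incrementAndGet();
    }
}
```

The trade-off is the usual one: the base class keeps control over how the member is created and replaced, at the cost of one extra method per exposed member.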

> allow scanners which return results ordred by a column value
> ------------------------------------------------------------
>                 Key: HBASE-605
>                 URL: https://issues.apache.org/jira/browse/HBASE-605
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>    Affects Versions: 0.2.0
>            Reporter: Clint Morgan
>            Priority: Minor
>         Attachments: hbase-605.patch
> We would like to be able to scan through tables with results ordered by (deserialized)
> column values. This approach maintains an in-memory sorted set for each ordered-by column
> in each HStore. This allows us to iterate through the keys in column order, and to do random
> reads on the key to get the full row.
> Without the index, we have to scan through all the rows to get the first result
> ordered by a column. Thus, when R is the number of rows in a table, N is the number of ordered-by
> rows we want, and R >> N, we can save a lot of work by not doing the full table scan.
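
The idea in the description can be sketched roughly as follows. This is an illustration only, not the code in hbase-605.patch: the class SortedColumnIndex, its methods, and the choice of a long column value with String row keys are all assumptions. It keeps an in-memory structure sorted by column value, so the first N row keys in column order come from walking the index rather than from scanning all R rows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory index for one ordered-by column.
final class SortedColumnIndex {
    // Column value -> row keys holding that value, both in sorted order.
    private final TreeMap<Long, TreeSet<String>> byValue = new TreeMap<>();
    // Row key -> value currently indexed, so an update can drop the stale entry.
    private final Map<String, Long> byRow = new HashMap<>();

    // Records (or updates) the column value for a row.
    void put(String rowKey, long value) {
        Long old = byRow.put(rowKey, value);
        if (old != null) {
            removeEntry(old, rowKey);
        }
        byValue.computeIfAbsent(value, v -> new TreeSet<>()).add(rowKey);
    }

    private void removeEntry(long value, String rowKey) {
        TreeSet<String> rows = byValue.get(value);
        if (rows != null) {
            rows.remove(rowKey);
            if (rows.isEmpty()) {
                byValue.remove(value);
            }
        }
    }

    // Returns up to n row keys in ascending column-value order.
    // Work is proportional to n, not to the total number of rows.
    List<String> firstRowKeys(int n) {
        List<String> out = new ArrayList<>();
        for (TreeSet<String> rows : byValue.values()) {
            for (String row : rows) {
                if (out.size() == n) {
                    return out;
                }
                out.add(row);
            }
        }
        return out;
    }
}
```

Each returned row key would then be used for a random read to fetch the full row, as the description says. The sketch ignores persistence, deletes, and concurrency, all of which a real HStore-level index would have to handle.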

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
