hbase-dev mailing list archives

From "Clint Morgan (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-605) allow scanners which return results ordered by a column value
Date Tue, 29 Apr 2008 21:55:56 GMT

     [ https://issues.apache.org/jira/browse/HBASE-605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Clint Morgan updated HBASE-605:
-------------------------------

    Attachment: hbase-605.patch

This patch contains a minimal implementation and a small unit test.

One known deficiency is that the sorted set is built twice per HRegion upon splitting. This
is due to HRegions being opened and then immediately closed upon a split.

I've tested a bit more in our layers above hbase and it works for me (so far). 

> allow scanners which return results ordered by a column value
> -------------------------------------------------------------
>
>                 Key: HBASE-605
>                 URL: https://issues.apache.org/jira/browse/HBASE-605
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>    Affects Versions: 0.2.0
>            Reporter: Clint Morgan
>            Priority: Minor
>         Attachments: hbase-605.patch
>
>
> We would like to be able to scan through tables with results ordered by (deserialized)
> column values. This approach maintains an in-memory sorted set for each ordered-by column
> in each HStore. This allows us to iterate through the keys in column order, and to do random
> reads on the key to get the full row.
> Without the index, we have to scan through all the rows to get the first result
> ordered by a column. Thus, when R is the number of rows in a table, N is the number of ordered-by
> rows we want, and R >> N, we can save a lot of work by not doing the full table scan.
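
To make the idea in the description concrete, here is a rough sketch of an in-memory sorted index over one ordered-by column. It is not the code in hbase-605.patch and uses no HBase classes: the class name ColumnOrderedIndex, its methods, and the use of a plain HashMap as a stand-in for the stored rows are all assumptions made for illustration. It only shows why reading the first N results from a sorted index avoids visiting all R rows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical illustration only; not the implementation in hbase-605.patch. */
class ColumnOrderedIndex {
    // Stand-in for the stored table: row key -> deserialized value of the ordered-by column.
    private final Map<String, Long> rows = new HashMap<>();
    // In-memory sorted index: column value -> row keys that currently hold that value.
    private final NavigableMap<Long, TreeSet<String>> index = new TreeMap<>();

    /** Writes a row and keeps the sorted index in step with it. */
    void put(String rowKey, long value) {
        Long old = rows.put(rowKey, value);
        if (old != null) {
            // Drop the stale index entry left by the previous value of this row.
            TreeSet<String> keys = index.get(old);
            keys.remove(rowKey);
            if (keys.isEmpty()) {
                index.remove(old);
            }
        }
        index.computeIfAbsent(value, v -> new TreeSet<>()).add(rowKey);
    }

    /** Returns up to n row keys in ascending column-value order, stopping as soon as n are found. */
    List<String> firstN(int n) {
        List<String> result = new ArrayList<>();
        for (TreeSet<String> keys : index.values()) {
            for (String key : keys) {
                if (result.size() == n) {
                    return result;
                }
                result.add(key);
            }
        }
        return result;
    }

    /** Random read by row key, standing in for fetching the full row. */
    Long get(String rowKey) {
        return rows.get(rowKey);
    }

    public static void main(String[] args) {
        ColumnOrderedIndex idx = new ColumnOrderedIndex();
        idx.put("r1", 30L);
        idx.put("r2", 10L);
        idx.put("r3", 20L);
        System.out.println(idx.firstN(2)); // prints [r2, r3]
    }
}
```

In this sketch firstN stops after N keys, so the work is proportional to N rather than R. Without the index, the same query would have to read every row and sort by the column value before returning anything.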

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

