hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-605) allow scanners which return results ordred by a column value
Date Thu, 22 May 2008 04:15:55 GMT

    [ https://issues.apache.org/jira/browse/HBASE-605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598902#action_12598902
] 

stack commented on HBASE-605:
-----------------------------

In TestOrderedScanner, do you want to remove the commented-out code?  Do you want to add a class
comment that says this test depends on a 'special' version of HRegionServer?  Would suggest
that all classes that depend on this custom HRegionServer also get marked appropriately in
their class comment (@see?): e.g. OrderedScanner won't work unless it's going against the ordered
HRS -- same for OrderedHRegion.

Some classes are missing licenses.

I suppose package protection prevents you putting all these new classes into a new orderedregionserver
package or into a subpackage named regionserver.ordered and client.ordered or some such?

You need to explain somewhere in javadoc what this OrderedRegionServer is, how it works, and
how to enable it.  Would suggest the class comment in the OrderedRegionServer or in the
Ordered interface as good places (otherwise, should I put in place a package.html to which
you can add?).  What would be great is that the next time someone shows up asking how they
can customize regionserver behavior, we can just point them to your OrderedRegionServer javadoc
as an example.

Thanks for adding accessors rather than making data members protected in RegionServer and
for making HStore, etc., subclassable.

Otherwise, the patch looks great.


> allow scanners which return results ordred by a column value
> ------------------------------------------------------------
>
>                 Key: HBASE-605
>                 URL: https://issues.apache.org/jira/browse/HBASE-605
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: client, regionserver
>    Affects Versions: 0.2.0
>            Reporter: Clint Morgan
>            Priority: Minor
>         Attachments: hbase-605-v2.patch, hbase-605.patch
>
>
> We would like to be able to scan through tables with results ordered by (deserialized)
> column values. This approach maintains an in-memory sorted set for each ordered-by column
> in each HStore. This allows us to iterate through the keys in column order, and to do random
> reads on the key to get the full row.
> Without the index, we have to scan through all the rows to get the first result
> ordered by a column. Thus, when R is the number of rows in a table, N is the number of ordered
> rows we want, and R >> N, we can save a lot of work by not doing the full table scan.
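
The indexing idea in the issue description above can be illustrated with a small,
self-contained sketch. This is not the code from the attached patches: the class
OrderedColumnIndexSketch and its methods are hypothetical names, a plain HashMap stands in
for the store, and column values are assumed to be already deserialized to Long. It only
shows why reading the first N results from a sorted in-memory index avoids touching all R rows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch only; not the classes from the HBASE-605 patch. */
final class OrderedColumnIndexSketch {
    // Stand-in for the stored table: row key -> full row (column name -> value).
    private final Map<String, Map<String, Long>> rows = new HashMap<>();
    // In-memory sorted index for one ordered-by column: column value -> row keys.
    private final TreeMap<Long, TreeSet<String>> index = new TreeMap<>();
    private final String orderedColumn;

    OrderedColumnIndexSketch(String orderedColumn) {
        this.orderedColumn = orderedColumn;
    }

    /** Writes a row and keeps the sorted index in step with it. */
    void put(String rowKey, Map<String, Long> row) {
        Map<String, Long> old = rows.put(rowKey, new HashMap<>(row));
        if (old != null && old.get(orderedColumn) != null) {
            // Drop the stale index entry left by the previous version of the row.
            Long oldValue = old.get(orderedColumn);
            TreeSet<String> keys = index.get(oldValue);
            if (keys != null) {
                keys.remove(rowKey);
                if (keys.isEmpty()) {
                    index.remove(oldValue);
                }
            }
        }
        Long value = row.get(orderedColumn);
        if (value != null) {
            index.computeIfAbsent(value, v -> new TreeSet<>()).add(rowKey);
        }
    }

    /** Returns the first n row keys in column-value order without scanning every row. */
    List<String> firstRowKeys(int n) {
        List<String> result = new ArrayList<>();
        for (TreeSet<String> keys : index.values()) {
            for (String key : keys) {
                if (result.size() == n) {
                    return result;
                }
                result.add(key);
            }
        }
        return result;
    }

    /** Random read by row key to fetch the full row. */
    Map<String, Long> getRow(String rowKey) {
        return rows.get(rowKey);
    }
}
```

A caller would ask firstRowKeys(N) for the keys and then call getRow on each one, so the
work is proportional to N plus the index lookups rather than to R. The cost is the extra
memory for the index and the bookkeeping on every write.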

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

