hadoop-common-dev mailing list archives

From "James Kennedy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1439) Add endRow parameter to HClient#obtainScanner
Date Thu, 28 Jun 2007 20:41:06 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508929 ]

James Kennedy commented on HADOOP-1439:
---------------------------------------

Oh, one thing I forgot to add in the limitations above:

Column criteria can only apply to columns included in the results. You cannot retrieve COL1
and COL2 where COL3 = 'XYZ'.
This is because the filtering happens at the HScanner level: the lower-level scanner for COL3,
for example, is not employed, so all of COL3's values appear as null.
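
To illustrate the limitation, here is a minimal sketch in plain Java. It does not use the HBase classes: ColumnFilterSketch, project and scan are hypothetical names, and the map-based rows are only a stand-in for what the HScanner sees once the per-column scanners have been chosen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of scanner-level column filtering; none of these names
// come from the HBase code base.
class ColumnFilterSketch {

    // Returns the row as the filter sees it: only requested columns are
    // populated, mirroring the fact that no lower-level scanner is opened
    // for columns outside the result set.
    static Map<String, String> project(Map<String, String> storedRow, List<String> requested) {
        Map<String, String> visible = new LinkedHashMap<>();
        for (String column : requested) {
            if (storedRow.containsKey(column)) {
                visible.put(column, storedRow.get(column));
            }
        }
        return visible;
    }

    // Scans the table and keeps rows where filterColumn equals expected.
    // The comparison runs against the projected row, so a column that was
    // not requested reads as null and never matches.
    static List<Map<String, String>> scan(List<Map<String, String>> table,
                                          List<String> requested,
                                          String filterColumn,
                                          String expected) {
        List<Map<String, String>> results = new ArrayList<>();
        for (Map<String, String> storedRow : table) {
            Map<String, String> visible = project(storedRow, requested);
            if (expected.equals(visible.get(filterColumn))) {
                results.add(visible);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("COL1", "a");
        row.put("COL2", "b");
        row.put("COL3", "XYZ");
        List<Map<String, String>> table = Collections.singletonList(row);

        // COL3 is filtered on but not requested.
        System.out.println(scan(table, Arrays.asList("COL1", "COL2"), "COL3", "XYZ"));
        // COL3 is both requested and filtered on.
        System.out.println(scan(table, Arrays.asList("COL1", "COL2", "COL3"), "COL3", "XYZ"));
    }
}
```

Under this model, one possible workaround is to request COL3 along with COL1 and COL2 and drop it on the client. That is an assumption drawn from the sketch, not something checked against the scanner code.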


> Add endRow parameter to HClient#obtainScanner
> ---------------------------------------------
>
>                 Key: HADOOP-1439
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1439
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: stack
>            Priority: Minor
>
> Currently the HClient#obtainScanner looks like this:
> {code}
> public synchronized HScannerInterface obtainScanner(Text[] columns, Text startRow) throws IOException;
> {code}
> Add an overload that allows specification of endRow:
> {code}
> public synchronized HScannerInterface obtainScanner(Text[] columns, Text startRow, Text endRow) throws IOException;
> {code}
> Use Case: Table contains the whole web.  Client just wants to scan Google's pages.  Currently,
> the client could cut off the scanner as soon as the row key leaves the Google domain, but it
> would be cleaner if {{HScannerInterface#next()}} returns false
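
The use case above can be sketched in plain Java by comparing a client-side cutoff with a scan that is bounded by an end row. Nothing here is the HBase API: EndRowScanSketch and its methods are hypothetical, a sorted map stands in for the table, the row keys are made up, and treating endRow as exclusive is an assumption that the issue does not settle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical model of the two scanning styles; a SortedMap stands in for a
// table whose rows are ordered by key.
class EndRowScanSketch {

    // Current approach: the client starts at startRow and stops the scan
    // itself as soon as a row key no longer has the wanted prefix.
    static List<String> scanWithClientCutoff(SortedMap<String, String> table,
                                             String startRow,
                                             String prefix) {
        List<String> keys = new ArrayList<>();
        for (String key : table.tailMap(startRow).keySet()) {
            if (!key.startsWith(prefix)) {
                break; // the row key has left the domain of interest
            }
            keys.add(key);
        }
        return keys;
    }

    // Proposed approach: the scan is bounded by endRow (assumed exclusive),
    // so iteration ends on its own and the client needs no key check.
    static List<String> scanWithEndRow(SortedMap<String, String> table,
                                       String startRow,
                                       String endRow) {
        return new ArrayList<>(table.subMap(startRow, endRow).keySet());
    }

    public static void main(String[] args) {
        SortedMap<String, String> table = new TreeMap<>();
        table.put("com.apple.www", "page");
        table.put("com.google.mail", "page");
        table.put("com.google.www", "page");
        table.put("com.yahoo.www", "page");

        // '/' sorts immediately after '.', so "com.google/" bounds every key
        // that starts with "com.google.".
        System.out.println(scanWithClientCutoff(table, "com.google.", "com.google."));
        System.out.println(scanWithEndRow(table, "com.google.", "com.google/"));
    }
}
```

Both methods visit the same rows in this example. The difference is where the stopping logic lives: in the caller's loop, or in the bounds handed to the scan.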

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

