hadoop-common-dev mailing list archives

From "James Kennedy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1439) Add endRow parameter to HClient#obtainScanner
Date Thu, 28 Jun 2007 01:11:26 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508701 ]

James Kennedy commented on HADOOP-1439:
---------------------------------------

Right, for RowFilters of the >, =, < type you're quite right. More generally, a RowFilter,
whether it implements those comparisons or something else, may need to signal the scanner to
stop altogether for whatever reason, even when the target rows do not sit in a single consecutive
chunk as they do for >, =, <; e.g. when it has reached a maximum number of nonconsecutive matched rows.
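
To make the idea concrete, here is a rough sketch of a filter that can tell the scanner to stop.
Everything in it is an assumption for illustration only: the StoppableRowFilter interface, the
MaxMatchesFilter class, the method names, and the TreeMap standing in for a table are all
hypothetical and are not the HADOOP-1531 patch or the actual hbase API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical filter contract (an assumption, not the HADOOP-1531 API). */
interface StoppableRowFilter {
    /** Returns true if the given row should be skipped. */
    boolean filterRow(String rowKey);

    /** Returns true once the scanner should stop altogether. */
    boolean filterAllRemaining();
}

/** Accepts rows whose key ends with a suffix and stops after maxMatches matches. */
final class MaxMatchesFilter implements StoppableRowFilter {
    private final String suffix;
    private final int maxMatches;
    private int matched = 0;

    MaxMatchesFilter(String suffix, int maxMatches) {
        this.suffix = suffix;
        this.maxMatches = maxMatches;
    }

    @Override
    public boolean filterRow(String rowKey) {
        if (!rowKey.endsWith(suffix)) {
            return true; // not a match: skip it, but keep scanning
        }
        matched++;
        return false;
    }

    @Override
    public boolean filterAllRemaining() {
        return matched >= maxMatches;
    }
}

/** Toy scanner over a sorted in-memory map standing in for a table. */
final class FilteredScan {
    static List<String> scan(TreeMap<String, String> table, String startRow,
                             StoppableRowFilter filter) {
        List<String> rows = new ArrayList<>();
        for (String key : table.tailMap(startRow, true).keySet()) {
            if (filter.filterAllRemaining()) {
                break; // the filter asked the scanner to stop
            }
            if (!filter.filterRow(key)) {
                rows.add(key);
            }
        }
        return rows;
    }
}
```

The matched rows here are deliberately nonconsecutive, so a plain range bound could not express
the stop condition; the filter has to say so itself.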

I'll implement this mechanism, clean up, and re-post the HADOOP-1531 patch when I get a chance.

That will make RowFilter more conducive to the EndRow filtering needed for this task. But
as I said there will still be a little overhead vs. implementing an explicit endRow param
to the scanner. 
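
For comparison, a minimal sketch of the two alternatives, again over a TreeMap standing in for
a table. The class and method names are hypothetical, and treating endRow as exclusive is my
assumption; the issue does not say which it should be.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class EndRowScan {
    /** Caller-side cutoff: the scan knows only startRow and every key is checked as it comes back. */
    static List<String> scanWithCutoff(TreeMap<String, String> table,
                                       String startRow, String endRow) {
        List<String> rows = new ArrayList<>();
        for (String key : table.tailMap(startRow, true).keySet()) {
            if (key.compareTo(endRow) >= 0) {
                break; // row key has left the requested range
            }
            rows.add(key);
        }
        return rows;
    }

    /** Explicit endRow: the upper bound is part of the scan itself (assumed exclusive). */
    static List<String> scanWithEndRow(TreeMap<String, String> table,
                                       String startRow, String endRow) {
        return new ArrayList<>(table.subMap(startRow, true, endRow, false).keySet());
    }
}
```

Both return the same rows; the difference is only where the bound check lives.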

> Add endRow parameter to HClient#obtainScanner
> ---------------------------------------------
>
>                 Key: HADOOP-1439
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1439
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: stack
>            Priority: Minor
>
> Currently the HClient#obtainScanner looks like this:
> {code}
> public synchronized HScannerInterface obtainScanner(Text[] columns, Text startRow) throws IOException;
> {code}
> Add an overload that allows specification of endRow:
> {code}
> public synchronized HScannerInterface obtainScanner(Text[] columns, Text startRow, Text endRow) throws IOException;
> {code}
> Use Case: Table contains the whole web.  Client just wants to scan google's pages.  Currently,
> client could cut off the scanner as soon as the row key leaves the google domain but cleaner
> if {{HScannerInterface#next()}} returns false

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.