hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16838) Implement basic scan
Date Fri, 11 Nov 2016 05:43:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15656227#comment-15656227 ]

stack commented on HBASE-16838:
-------------------------------

Looking. It's much nicer now, I think. We can work on the one-rpc small scan over in the linked
issue.

Pity names get to be like this: AsyncScanRegionRpcRetryingCaller... which runs an AsyncScanRegionRpcRetryingCallables.
The RpcRetryingCaller stuff and RpcRetryingCallable were all there before you, so not your
fault... just saying. Is a Scan always against a Region (i.e., is the "Region" in the name redundant)?

Looking at the Response, we need Scan in there? The Scan in Response is different from originalScan?
nvm... I see this Scan carries state of the general Scan as we progress.

locateToPreviousRegion is for when it is a reverse Scan?

Thanks for comment on scan timeout. So scan timeout is different to operation timeout? It
is at least according to your comment up in rb: "As now we have heartbeat support for scan,
ideally a scan will never timeout unless the RS is crash. The RS will always return something
before the rpc timeout or scan timeout to tell the client that it is still alive.
The scan timeout is used as operation timeout for every operations in a scan, such as openScanner
or next."

I think you should attach the above comment to the scan timeout itself so it is clear what the scan
timeout means. It helps.

Update doc on commit:

	   * The basic scan API uses the observer pattern. All results that match the given scan object will
	   * be passed to the given {@code scanObserver} by calling {@link ScanConsumer#onNext(Result[])}.

... you changed observer to be a consumer.

Is there example code on how I'd do an async Scan? I create a ScanConsumer and pass it in
then it will get called with Results as the Scan progresses? The AsyncTable#scan returns immediately?
Perhaps stick it in javadoc for the scan method? Is SimpleScanObserver a good example or just
a stop gap with its queue?
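
For illustration, the callback flow asked about above can be sketched roughly as follows. This is a minimal sketch, not the HBase API: the ScanConsumer interface here, its method signatures, the scan and collectAll helpers, and the use of String[] in place of Result[] are all stand-ins invented for the example. It only shows the general shape: the scan call returns immediately, and the consumer is invoked with batches from another thread as the scan progresses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncScanSketch {

  /** Hypothetical stand-in for the consumer discussed above; not the HBase interface. */
  interface ScanConsumer {
    /** Called once per batch of results; return false to stop the scan early. */
    boolean onNext(String[] results);

    void onError(Throwable error);

    void onComplete();
  }

  /** Hypothetical scan: returns immediately and delivers batches on a background thread. */
  static void scan(List<String[]> batches, ScanConsumer consumer, ExecutorService rpcThreads) {
    rpcThreads.execute(() -> {
      try {
        for (String[] batch : batches) {
          if (!consumer.onNext(batch)) {
            break; // the consumer asked to stop
          }
        }
        consumer.onComplete();
      } catch (Throwable t) {
        consumer.onError(t);
      }
    });
  }

  /** Collects every row into a list and completes the future when the scan ends. */
  static CompletableFuture<List<String>> collectAll(
      List<String[]> batches, ExecutorService rpcThreads) {
    CompletableFuture<List<String>> done = new CompletableFuture<>();
    List<String> rows = new ArrayList<>();
    scan(batches, new ScanConsumer() {
      @Override
      public boolean onNext(String[] results) {
        rows.addAll(Arrays.asList(results));
        return true;
      }

      @Override
      public void onError(Throwable error) {
        done.completeExceptionally(error);
      }

      @Override
      public void onComplete() {
        done.complete(rows);
      }
    }, rpcThreads);
    return done;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService rpcThreads = Executors.newSingleThreadExecutor();
    try {
      List<String[]> batches =
          Arrays.asList(new String[] {"row1", "row2"}, new String[] {"row3"});
      CompletableFuture<List<String>> future = collectAll(batches, rpcThreads);
      System.out.println(future.get()); // prints [row1, row2, row3]
    } finally {
      rpcThreads.shutdown();
    }
  }
}
```

Bridging the callbacks to a CompletableFuture is just one way for a caller to wait for the end of the scan; a queue-backed consumer would be another.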

Don't kill me, but should ScanConsumer be ScanResultConsumer (can do in a follow-up if it makes sense)
or just ScanResult?

CompleteResultScanResultCache should be CompleteScanResultCache to match AllowPartialScanResultCache?

I'd be good committing this as is and addressing what remains in follow-on. +1


> Implement basic scan
> --------------------
>
>                 Key: HBASE-16838
>                 URL: https://issues.apache.org/jira/browse/HBASE-16838
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0
>
>         Attachments: HBASE-16838-v1.patch, HBASE-16838-v2.patch, HBASE-16838-v3.patch, HBASE-16838.patch
>
>
> Implement a scan that works like a gRPC streaming call, where all returned results are
> passed to a ScanConsumer. The methods of the consumer will be called directly in the rpc framework
> threads, so it is not allowed to do time-consuming work in those methods. So in general only
> experts or the implementation of other methods in AsyncTable can call this method directly,
> which is why I call it 'basic scan'.
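
The constraint in the description above (consumer methods run on the rpc framework threads, so they must return quickly) is commonly handled by handing each batch to a separate executor. The following is a minimal sketch of that idea only; BatchCallback and offload are hypothetical names made up for this example, and String[] stands in for a batch of results. It is not how AsyncTable itself is implemented.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class OffloadingConsumerSketch {

  /** Hypothetical callback; stands in for whatever the rpc threads invoke per batch. */
  interface BatchCallback {
    void onNext(String[] results);
  }

  /** Wraps slow work so that the callback itself only enqueues a task and returns. */
  static BatchCallback offload(Consumer<String[]> slowWork, Executor workers) {
    return results -> workers.execute(() -> slowWork.accept(results));
  }

  public static void main(String[] args) throws InterruptedException {
    // A single worker thread keeps batches in the order they were delivered.
    ExecutorService workers = Executors.newSingleThreadExecutor();
    List<String> processed = new CopyOnWriteArrayList<>();

    BatchCallback callback = offload(batch -> {
      for (String row : batch) {
        processed.add(row.toUpperCase()); // placeholder for time-consuming work
      }
    }, workers);

    // Simulates the rpc framework thread delivering two batches.
    callback.onNext(new String[] {"row1", "row2"});
    callback.onNext(new String[] {"row3"});

    workers.shutdown();
    workers.awaitTermination(5, TimeUnit.SECONDS);
    System.out.println(processed); // prints [ROW1, ROW2, ROW3]
  }
}
```

Note that the executor's queue in this sketch is unbounded, so a slow worker would let batches pile up; a real consumer would need some form of backpressure.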



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)