hbase-issues mailing list archives

From "Jonathan Lawlor (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13215) A limit on the raw key values is needed for each next call of a scanner
Date Thu, 12 Mar 2015 16:20:39 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358897#comment-14358897 ]

Jonathan Lawlor commented on HBASE-13215:

Hey [~heliangliang], this sounds like it may be related to, or even solved by, the fix for
HBASE-13090 that I am currently looking into.

Over in that issue we have discussed bounding the execution time of scans on the server side.
The idea is to set a time limit for the execution of the scan on the server and, once that
time limit is reached, send a heartbeat/keep-alive message back to the client with whatever
data has been accumulated thus far (potentially none). This would prevent the scanner from
timing out in the case you describe (where many cells have been deleted or filtered out).
The client scanner would then issue further RPCs to the server if the data within the
heartbeat/keep-alive message was not sufficient to service the application's request.
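
To make the idea concrete, here is a rough sketch of what a time-bounded next call could look
like on the server. It is only an illustration in Python, not HBase code: ScanResponse,
TimeLimitedScanner, the keep predicate and the injectable clock are all hypothetical names
chosen for this example.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

_END = object()  # sentinel marking an exhausted cell stream


@dataclass
class ScanResponse:
    cells: List[str] = field(default_factory=list)
    heartbeat: bool = False    # True when the time limit cut this call short
    more_results: bool = True  # False once the underlying scanner is exhausted


class TimeLimitedScanner:
    """Hypothetical server-side scanner; not the real HBase RegionScanner."""

    def __init__(self, raw_cells: Iterable[str], keep: Callable[[str], bool],
                 clock: Callable[[], float] = time.monotonic):
        self._cells = iter(raw_cells)
        self._keep = keep    # returns False for deleted/filtered-out cells
        self._clock = clock  # injectable so the behaviour can be tested

    def next(self, caching: int, time_limit_s: float) -> ScanResponse:
        deadline = self._clock() + time_limit_s
        resp = ScanResponse()
        while len(resp.cells) < caching:
            if self._clock() >= deadline:
                # Time is up: return what we have (possibly nothing) as a heartbeat.
                resp.heartbeat = True
                return resp
            cell = next(self._cells, _END)
            if cell is _END:
                resp.more_results = False
                return resp
            if self._keep(cell):
                resp.cells.append(cell)
        return resp
```

The point of the sketch is that a long run of deleted or filtered-out cells can no longer keep
a single call busy indefinitely: the scanner keeps its position, so the next RPC simply resumes
where the heartbeat left off.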

This mechanism is also nice because it is mostly invisible to the application layer. All
time limit handling happens on the server side, and the limit on execution time is implied by
the configured client-side scanner timeout. Thus, the application layer won't need to specify
any additional limits; the timeouts will be handled automatically.
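
As a rough illustration of the client side, the sketch below shows a scan loop that never
exposes heartbeats to the caller. It is Python rather than HBase code, and everything in it is
an assumption made for the example: the rpc_next callable, the (cells, is_heartbeat,
more_results) response tuple, and in particular the choice of half the scanner timeout as the
server time limit.

```python
from typing import Callable, Iterator, List, Tuple

# Hypothetical response shape: (cells, is_heartbeat, more_results)
Response = Tuple[List[str], bool, bool]


def derive_server_time_limit_ms(client_scanner_timeout_ms: int,
                                fraction: float = 0.5) -> int:
    """Derive the server-side limit from the client scanner timeout.

    The fraction is an assumption for this sketch; it only needs to leave
    enough headroom for the response to arrive before the client times out.
    """
    return int(client_scanner_timeout_ms * fraction)


def scan(rpc_next: Callable[[int], Response],
         client_scanner_timeout_ms: int) -> Iterator[str]:
    """Yield cells to the application, hiding heartbeats entirely."""
    limit_ms = derive_server_time_limit_ms(client_scanner_timeout_ms)
    while True:
        cells, _is_heartbeat, more_results = rpc_next(limit_ms)
        # A heartbeat may carry zero cells; the loop just issues another RPC.
        yield from cells
        if not more_results:
            return
```

The application only iterates over scan(...) and configures the scanner timeout it already
has; it never sees an empty heartbeat response or a separate time-limit setting.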

> A limit on the raw key values is needed for each next call of a scanner
> -----------------------------------------------------------------------
>                 Key: HBASE-13215
>                 URL: https://issues.apache.org/jira/browse/HBASE-13215
>             Project: HBase
>          Issue Type: Improvement
>          Components: Scanners
>            Reporter: He Liangliang
>            Assignee: He Liangliang
> In the current scanner next, there are several limits: caching, batch and size. But there
> is no limit on raw data scanned, so the time consumed by a next call is unbounded. For example,
> many consecutive deleted or filtered-out cells will lead to a socket timeout. This can make
> user code get stuck.
