hbase-issues mailing list archives

From "Phil Yang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-15576) Scanning cursor to prevent blocking long time on ResultScanner.next()
Date Tue, 06 Jun 2017 07:50:18 GMT

     [ https://issues.apache.org/jira/browse/HBASE-15576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Phil Yang updated HBASE-15576:
    Attachment: HBASE-15576.v07.patch

> Scanning cursor to prevent blocking long time on ResultScanner.next()
> ---------------------------------------------------------------------
>                 Key: HBASE-15576
>                 URL: https://issues.apache.org/jira/browse/HBASE-15576
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Phil Yang
>            Assignee: Phil Yang
>             Fix For: 2.0.0, 1.4.0
>         Attachments: HBASE-15576.branch-1.v01.patch, HBASE-15576.branch-1.v01.patch,
HBASE-15576.branch-1.v01.patch, HBASE-15576.branch-1.v02.patch, HBASE-15576.v01.patch, HBASE-15576.v02.patch,
HBASE-15576.v03.patch, HBASE-15576.v03.patch, HBASE-15576.v04.patch, HBASE-15576.v04.patch,
HBASE-15576.v05.patch, HBASE-15576.v06.patch, HBASE-15576.v07.patch
> Since 1.1.0 was released, scanning has used the partial-result and heartbeat protocols to avoid
returning oversized responses or timing out. Even so, ResultScanner.next() may block for longer
than the configured timeout before returning a Result if the row is very large, the filter is
sparse, or there are too many delete markers in the files.
> However, in some scenarios we don't want it to block for too long. For example, consider a
web service that handles requests from mobile devices on unstable networks, where the timeout
between the device and the web service cannot be long (e.g. only 5 seconds). The service
scans rows from HBase and returns them to the mobile device. In this scenario, the simplest
approach is to make the web service stateless. The mobile app sends requests one by one until
it has enough data, just like paging through a list. Each request carries a start position
that depends on the last result from the web service. Different requests can be sent to
different web servers because the service is stateless.
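> The stateless paging pattern described above can be sketched as follows. This is only an
illustration, not HBase code: the class StatelessPaging, the Page type, and fetchPage are
hypothetical names, and a TreeMap stands in for the table. Every request carries its own
start position, so any server can answer it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for a stateless paging endpoint; not part of HBase.
final class StatelessPaging {
    // One page of row keys plus the position the next request should carry.
    static final class Page {
        final List<String> rows;
        final String nextStart; // null when there is nothing more to read

        Page(List<String> rows, String nextStart) {
            this.rows = rows;
            this.nextStart = nextStart;
        }
    }

    // Handles one request. All state arrives in the arguments,
    // so the server keeps nothing between calls.
    static Page fetchPage(NavigableMap<String, String> table, String startAfter, int limit) {
        NavigableMap<String, String> tail =
            (startAfter == null) ? table : table.tailMap(startAfter, false);
        List<String> rows = new ArrayList<>();
        String last = null;
        for (String key : tail.keySet()) {
            if (rows.size() == limit) {
                return new Page(rows, last); // more rows remain after 'last'
            }
            rows.add(key);
            last = key;
        }
        return new Page(rows, null); // reached the end of the table
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (String k : new String[] {"a", "b", "c", "d", "e"}) {
            table.put(k, "v-" + k);
        }
        String start = null;
        do {
            Page p = fetchPage(table, start, 2);
            System.out.println(p.rows + " next=" + p.nextStart);
            start = p.nextStart;
        } while (start != null);
    }
}
```

> Note that this sketch can only hand back a position when it returns at least one row, which
is exactly the limitation the rest of this issue is about.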
> Therefore, the stateless web service needs a cursor from HBase that tells it how far the
RegionScanner has scanned when the HBase client receives an empty heartbeat. The service
returns the cursor to the mobile device even though the response has no data. The next request
can start at the cursor position; without the cursor we have to scan again from the last
returned result, and we may time out forever. And of course, even if the heartbeat message is
not empty, we can still use the cursor to avoid re-scanning rows/cells that have already been
skipped.
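> To illustrate why resuming without a cursor can time out forever, here is a small simulation.
It is a sketch under stated assumptions, not the patch's implementation: CursorProgress,
Response, scanOnce, and requestsUntilMatch are hypothetical names, rows are plain array
indexes, and the "timeout" is modelled as a budget of rows the server may examine per request.

```java
// Hypothetical simulation of resuming a scan with and without a cursor; not HBase code.
final class CursorProgress {
    static final class Response {
        final int matchedRow; // -1 models an empty heartbeat: budget spent, no match
        final int cursor;     // last row the server examined in this request

        Response(int matchedRow, int cursor) {
            this.matchedRow = matchedRow;
            this.cursor = cursor;
        }
    }

    // Examines at most 'budget' rows starting at 'start' and stops at the first match.
    static Response scanOnce(boolean[] matches, int start, int budget) {
        int cursor = start - 1;
        for (int row = start; row < matches.length && row < start + budget; row++) {
            cursor = row;
            if (matches[row]) {
                return new Response(row, cursor);
            }
        }
        return new Response(-1, cursor);
    }

    // Returns how many requests were needed to get the first matching row,
    // or -1 if none arrived within maxRequests.
    static int requestsUntilMatch(boolean[] matches, int budget,
                                  boolean useCursor, int maxRequests) {
        int start = 0;
        for (int n = 1; n <= maxRequests; n++) {
            Response r = scanOnce(matches, start, budget);
            if (r.matchedRow >= 0) {
                return n;
            }
            if (useCursor) {
                start = r.cursor + 1; // resume where the server stopped
            }
            // Without a cursor the client only knows the last returned row,
            // so 'start' does not move and the same rows are scanned again.
        }
        return -1;
    }

    public static void main(String[] args) {
        boolean[] matches = new boolean[10];
        matches[7] = true; // sparse filter: only row 7 passes
        System.out.println("with cursor: " + requestsUntilMatch(matches, 3, true, 100));
        System.out.println("without cursor: " + requestsUntilMatch(matches, 3, false, 100));
    }
}
```

> With a budget of 3 rows per request, the cursor variant reaches row 7 on the third request,
while the variant without a cursor rescans rows 0 to 2 on every request and never gets there.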
> Obviously, we give up consistency for the scan because even the HBase client is stateless,
but that is acceptable in this scenario. Maybe we can keep the mvcc in the cursor so that we
can get a consistent view?
> HBASE-13099 had some discussion, but there has been no further progress so far.
> API:
> In Scan we need a new method, setNeedCursorResult(true), to get the cursor row key when
there is an RPC response but the client cannot return any Result. In this mode,
ResultScanner.next() will not block for longer than the timeout setting.
> {code}
> Result r;
> while ((r = scanner.next()) != null) {
>   if (r.isCursor()) {
>     // The scan has not ended; this is a cursor. Save its row key and close the scanner
>     // if you want, or just continue the loop and call next() again.
>   } else {
>     // A normal Result; handle it just like before.
>   }
> }
> // The scan has ended.
> {code}
