Date: Mon, 23 Mar 2015 23:39:54 +0000 (UTC)
From: "Jonathan Lawlor (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-11544) [Ergonomics] hbase.client.scanner.caching is dogged and will try to return batch even if it means OOME

    [ https://issues.apache.org/jira/browse/HBASE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376919#comment-14376919 ]

Jonathan Lawlor commented on HBASE-11544:
-----------------------------------------

[~apurtell] [~lhofhansl] Thanks for bringing up these discussion points. I have included some discussion below about the design decisions made here, and it would be great to hear your thoughts on them.

bq. If scanning millions of rows, millions of objects?

Ya

bq. The size estimations are done up in RSRpcServices

To avoid out of memory errors that resulted from very large rows, the size calculation was pushed all the way down into StoreScanner so that it is performed between cells (rather than between rows in RSRpcServices). This means that we may reach the size limit in the middle of a row and form a partial result.

With the size calculation pushed all the way down to StoreScanner, we needed some way of communicating upwards to the RegionScanner and RSRpcServices when a partial result is formed (i.e. when the size limit is reached in the middle of a row).

At first, the intention was NOT to change the return type from boolean. However, the implementation with the boolean return type ended up requiring many repetitions of the size calculation: the RegionScanner and RSRpcServices both needed to calculate the result size (in addition to the calculation that had been pushed down to StoreScanner) in order to check whether or not the size limit had been reached, since there was no way to communicate that fact upwards with a boolean that only indicates whether more values exist. The problems with this approach (illustrated by the sketch below) were:

* The size calculation was being repeated too much
* The state was not explicit enough. Cells were being returned from StoreScanner and it was up to the caller of StoreScanner#next to figure out why those were the cells being returned (size limit reached? batch limit reached?). The only way for the state to bubble up from the StoreScanner was to repeat all of the logic that made the StoreScanner return those Cells.
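To make the contrast concrete, here is a rough sketch of the two shapes of the scanner contract being discussed. This is illustrative only, not the actual patch or the real InternalScanner/StoreScanner interfaces; the method signature and the NextState values shown are assumptions for the sake of the example.

{code:java}
// Simplified sketch only -- not the actual HBASE-11544 patch.
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.Cell;

public class ScannerContractSketch {

  // Boolean-return shape: the caller cannot tell WHY next() stopped, so each
  // layer above (RegionScanner, RSRpcServices) must re-measure the size of the
  // returned cells to decide whether the size limit was the reason.
  interface BooleanScanner {
    // true means more values exist beyond what was just returned
    boolean next(List<Cell> outResults, int batchLimit, long sizeLimit) throws IOException;
  }

  // State-return shape: the scanner itself reports why it stopped, so the size
  // accounting done between cells in StoreScanner simply bubbles up and is
  // never repeated by the callers.
  enum NextState {
    MORE_VALUES,          // stopped at a row boundary, more rows remain
    NO_MORE_VALUES,       // scan exhausted
    SIZE_LIMIT_REACHED,   // stopped mid-row: the caller holds a partial result
    BATCH_LIMIT_REACHED   // stopped because the per-row batch limit was hit
  }

  interface StateScanner {
    NextState next(List<Cell> outResults, int batchLimit, long sizeLimit) throws IOException;
  }
}
{code}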
NextState was introduced to make this communication more explicit and to avoid replication of the size calculations. Any alternative approaches are welcome. If there is a way to keep the boolean return type and avoid replicating the size calculation, we could certainly try that alternative. Or, if repeating the size calculation turns out to be less costly than introducing NextState, perhaps we should go down that route. (A rough usage sketch is appended at the end of this message.)

> [Ergonomics] hbase.client.scanner.caching is dogged and will try to return batch even if it means OOME
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-11544
>                 URL: https://issues.apache.org/jira/browse/HBASE-11544
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Jonathan Lawlor
>            Priority: Critical
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: HBASE-11544-branch_1_0-v1.patch, HBASE-11544-branch_1_0-v2.patch, HBASE-11544-v1.patch, HBASE-11544-v2.patch, HBASE-11544-v3.patch, HBASE-11544-v4.patch, HBASE-11544-v5.patch, HBASE-11544-v6.patch, HBASE-11544-v6.patch, HBASE-11544-v6.patch, HBASE-11544-v7.patch, HBASE-11544-v8-branch-1.patch, HBASE-11544-v8.patch, gc.j.png, hits.j.png, mean.png, net.j.png
>
>
> Running some tests, I set hbase.client.scanner.caching=1000. The dataset has large cells. I kept OOME'ing.
> Serverside, we should measure how much we've accumulated and return to the client whatever we've gathered once we pass a certain size threshold, rather than keep accumulating till we OOME.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
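For completeness, here is an equally rough sketch of how a caller such as the RPC layer could use a state-style return to implement the behaviour described in the quoted issue above (gather results and ship them to the client once a size threshold is passed). Again, this is illustrative only and not the real RSRpcServices code; the types mirror the hypothetical sketch earlier in this message.

{code:java}
// Illustrative caller loop only -- not the real RSRpcServices scan handler.
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.Cell;

public class ScanLoopSketch {

  // Assumed types, mirroring the earlier sketch in this message.
  enum NextState { MORE_VALUES, NO_MORE_VALUES, SIZE_LIMIT_REACHED, BATCH_LIMIT_REACHED }

  interface StateScanner {
    NextState next(List<Cell> outResults, int batchLimit, long sizeLimit) throws IOException;
  }

  // Keep calling next() while the scanner reports that more rows remain and no
  // limit has been hit; as soon as the size threshold is passed (possibly in
  // the middle of a row) or the scan is exhausted, return whatever has been
  // gathered instead of buffering until an OOME.
  static List<Cell> gatherForRpc(StateScanner scanner, int batchLimit, long sizeLimit)
      throws IOException {
    List<Cell> results = new ArrayList<>();
    NextState state;
    do {
      // The scanner appends cells to 'results' and tracks their accumulated
      // size itself, so the caller never re-measures the cells.
      state = scanner.next(results, batchLimit, sizeLimit);
    } while (state == NextState.MORE_VALUES);
    // SIZE_LIMIT_REACHED means 'results' may end with a partial row that the
    // client side would have to stitch back together.
    return results;
  }
}
{code}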