hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15325) ResultScanner allowing partial result will miss the rest of the row if the region is moved between two rpc requests
Date Wed, 16 Mar 2016 05:40:33 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196799#comment-15196799 ]

Anoop Sam John commented on HBASE-15325:
----------------------------------------

It seems you have not applied the optimization for the batching case (the one you suggested).
Don't you want it? That is OK either way.
{code}
if (this.lastResult.isPartial() || scan.getBatch() > 0) {
  updateLastCellLoadedToCache(this.lastResult);
}
{code}
My suggestion was to add an else case that sets lastCellLoadedToCache to null. Please also add
a clear comment on the variable saying that it is the last cell from an incomplete row that has
been added to the cache (and so has been, or may be, returned to the user). Maybe we can find a
better name too, though that may end up too long.
With that, this check can be changed from
{code}
if (this.lastResult != null && (this.lastResult.isPartial() || scan.getBatch() > 0)) {
  rs = filterLoadedCell(rs);
{code}
to
{code}
if (this.lastCellLoadedToCache != null) {
  rs = filterLoadedCell(rs);
{code}
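
To make the suggestion concrete, here is a minimal, self-contained sketch of the idea. It is not the patch: the class name LastCellTrackerSketch and both method names are hypothetical, String stands in for Cell, and plain string ordering stands in for the real cell comparator. It only shows how the else case that resets the field to null lets the later check depend on the field alone.

```java
import java.util.ArrayList;
import java.util.List;

public class LastCellTrackerSketch {
    // Last cell of an incomplete row that was added to the cache;
    // null when the last cached result ended on a row boundary.
    private String lastCellLoadedToCache;

    // Hypothetical hook, called after a result has been put in the cache.
    public void onResultCached(List<String> cells, boolean partial, int batch) {
        if ((partial || batch > 0) && !cells.isEmpty()) {
            lastCellLoadedToCache = cells.get(cells.size() - 1);
        } else {
            // The suggested else case: nothing to filter on the next call.
            lastCellLoadedToCache = null;
        }
    }

    // Drops cells that sort at or before the last cached cell.
    public List<String> filterLoadedCells(List<String> incoming) {
        if (lastCellLoadedToCache == null) {
            return incoming;
        }
        List<String> remaining = new ArrayList<>();
        for (String cell : incoming) {
            if (cell.compareTo(lastCellLoadedToCache) > 0) {
                remaining.add(cell);
            }
        }
        return remaining;
    }
}
```

With the field reset in the else branch, the caller no longer needs to look at lastResult, isPartial() or getBatch() before filtering.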

{code}
List<Cell> list = new ArrayList<>(result.rawCells().length - index);
for (; index < result.rawCells().length; index++) {
  list.add(result.rawCells()[index]);
}
{code}
Can you change this to an array, add the cells to that, and use the Result.create API that takes an array?
It is a small optimization, but still worth it since it needs no extra lines.
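
A rough sketch of the array-based copy, under stated assumptions: the class TailCopySketch and the method tailOf are hypothetical names, and String stands in for Cell so the snippet compiles without HBase on the classpath. In the scanner code the resulting array would be what gets handed to the Result.create overload that takes an array.

```java
import java.util.Arrays;

public class TailCopySketch {
    // Returns a new array holding cells[index .. cells.length).
    // String is a stand-in for Cell; the source array is left untouched.
    public static String[] tailOf(String[] cells, int index) {
        return Arrays.copyOfRange(cells, index, cells.length);
    }
}
```

This skips the intermediate list and also avoids calling rawCells() on every loop iteration.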


> ResultScanner allowing partial result will miss the rest of the row if the region is moved between two rpc requests
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-15325
>                 URL: https://issues.apache.org/jira/browse/HBASE-15325
>             Project: HBase
>          Issue Type: Bug
>          Components: dataloss, Scanners
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Phil Yang
>            Assignee: Phil Yang
>            Priority: Critical
>             Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 1.4.0
>
>         Attachments: 15325-test.txt, HBASE-15325-v1.txt, HBASE-15325-v10.patch, HBASE-15325-v2.txt, HBASE-15325-v3.txt, HBASE-15325-v5.txt, HBASE-15325-v6.1.txt, HBASE-15325-v6.2.txt, HBASE-15325-v6.3.txt, HBASE-15325-v6.4.txt, HBASE-15325-v6.5.txt, HBASE-15325-v6.txt, HBASE-15325-v7.patch, HBASE-15325-v8.patch, HBASE-15325-v9.patch
>
>
> HBASE-11544 allows a scan RPC to return part of a row, to reduce memory usage for a single RPC request. The client can use setAllowPartial or setBatch to get several cells of a row instead of the whole row.
> However, the scanner state is saved on the server, and we need it to fetch the next part when the previous result was partial. If the region is moved to another RS, the client gets a NotServingRegionException and opens a new scanner on the new RS, which is treated as a new scan starting from the end of this row. So the remaining cells of the row in the last result will be missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
