hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15325) ResultScanner allowing partial result will miss the rest of the row if the region is moved between two rpc requests
Date Tue, 08 Mar 2016 09:50:40 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184731#comment-15184731
] 

Anoop Sam John commented on HBASE-15325:
----------------------------------------

{quote}
e.g. if the client receives 3+5+3 cells in three RPC requests, we should return 5+5+1 cells per
batch to the user, and none of them is partial, right?
What should we guarantee when the user calls both setAllowPartial and setBatch? If the user gets three
results 3+5+3, is the second result partial? Is the last one partial? And what should we guarantee
for the last partial result when the user calls setAllowPartial only? Currently the flag on the last
one is true, if I am not wrong.
{quote}
Good questions.
Yes, I am also +1 for returning 5+5+1 when setAllowPartial = false (as per your example). In that
case the partial flag does not come into question.
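
To make the 5+5+1 case concrete, here is a minimal sketch of regrouping the cells of one row, received as 3+5+3, into batches of 5. It is not the actual ClientScanner code; the class name Rebatcher and the method rebatch are made up for illustration, and cells are plain strings rather than HBase Cell objects.

```java
import java.util.ArrayList;
import java.util.List;

public class Rebatcher {
    /**
     * Regroups the cells of a single row, received in arbitrary RPC chunks,
     * into batches of at most batchSize cells. Hypothetical helper, not an HBase API.
     */
    static <T> List<List<T>> rebatch(List<List<T>> rpcChunks, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        List<T> current = new ArrayList<>();
        for (List<T> chunk : rpcChunks) {
            for (T cell : chunk) {
                current.add(cell);
                // Emit a batch as soon as it is full, regardless of RPC boundaries.
                if (current.size() == batchSize) {
                    batches.add(current);
                    current = new ArrayList<>();
                }
            }
        }
        // Whatever is left over is the final, shorter batch of the row.
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        // Build 11 cells arriving as 3 + 5 + 3.
        List<List<String>> chunks = new ArrayList<>();
        int n = 0;
        for (int size : new int[] {3, 5, 3}) {
            List<String> chunk = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                chunk.add("c" + n++);
            }
            chunks.add(chunk);
        }
        for (List<String> batch : rebatch(chunks, 5)) {
            System.out.println(batch.size() + " cells: " + batch);
        }
    }
}
```

With setBatch(5), the user would then see batches of 5, 5 and 1 cells, none of which needs a partial flag.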
Now coming to the case where setAllowPartial = true: it gets complicated :-)
I think we can mark the 1st Result (3 cells) as partial. The second is not partial when
we consider it alone. And obviously the last one is not partial anyway. This is how the
HBase client sees the results.
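
One possible reading of this flagging, as a sketch only: a Result is marked partial when it carries fewer cells than the batch size and is not the last Result of the row. That rule is my assumption for illustration, and the names PartialFlags and clientPartialFlags are hypothetical, not HBase APIs.

```java
import java.util.Arrays;

public class PartialFlags {
    /**
     * Computes a partial flag per Result of one row, as seen by the client.
     * Assumed rule: partial = fewer cells than batchSize AND not the last Result of the row.
     */
    static boolean[] clientPartialFlags(int[] chunkSizes, int batchSize) {
        boolean[] partial = new boolean[chunkSizes.length];
        for (int i = 0; i < chunkSizes.length; i++) {
            boolean lastOfRow = (i == chunkSizes.length - 1);
            partial[i] = !lastOfRow && chunkSizes[i] < batchSize;
        }
        return partial;
    }

    public static void main(String[] args) {
        // Results of 3, 5 and 3 cells with a batch size of 5.
        boolean[] flags = clientPartialFlags(new int[] {3, 5, 3}, 5);
        System.out.println(Arrays.toString(flags)); // prints [true, false, false]
    }
}
```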
When the results are passed up to the user application, the user may have to work on them and
adjust the partial flags himself. How depends on the way he handles partial results. If he wants
to merge partials on his end, the 1st one being marked partial tells him he has to wait for the
next Result. That one comes with 5 cells (so we mark it as full). He takes 2 cells from it and
merges them with the 1st. At this point the 2nd Result becomes partial, so he knows he has to
look at the 3rd Result and merge it. As that is the last Result, he completes things there.
But he set setAllowPartial to true, which means he may be consuming the results as
they come. Otherwise he could have just set setAllowPartial to false?

So on the HBase client side, we can follow this way of setting the flag, IMHO. I believe we
are already doing it this way.

> ResultScanner allowing partial result will miss the rest of the row if the region is
moved between two rpc requests
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-15325
>                 URL: https://issues.apache.org/jira/browse/HBASE-15325
>             Project: HBase
>          Issue Type: Bug
>          Components: dataloss, Scanners
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Phil Yang
>            Assignee: Phil Yang
>            Priority: Critical
>         Attachments: 15325-test.txt, HBASE-15325-v1.txt, HBASE-15325-v2.txt, HBASE-15325-v3.txt,
HBASE-15325-v5.txt, HBASE-15325-v6.1.txt, HBASE-15325-v6.2.txt, HBASE-15325-v6.3.txt, HBASE-15325-v6.4.txt,
HBASE-15325-v6.5.txt, HBASE-15325-v6.txt
>
>
> HBASE-11544 allows a scan RPC to return part of a row to reduce memory usage per RPC
request. The client can call setAllowPartial or setBatch to get several cells of a row instead
of the whole row.
> However, the scanner state is saved on the server, and we need it to fetch the next
part when the previous result was partial. If the region is moved to another RS, the client will
get a NotServingRegionException and open a new scanner on the new RS, which will be treated
as a new scan starting from the end of this row. So the remaining cells of the row from the last
result will be missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
