hbase-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17177) Major compaction can break the region/row level atomic when scan even if we pass mvcc to client
Date Fri, 02 Dec 2016 03:52:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15713934#comment-15713934 ]

Duo Zhang commented on HBASE-17177:

I have been thinking about this for days. I think we should have a scan option called 'atomicity'
with three values: {{None}}, {{Row}} and {{Region}}. The default value will be {{Row}}.

This will change how errors are handled on the client side.

For {{None}}, we can in general recover from any exception by opening a new region scanner,
unless we hit a timeout.

For {{Row}}, if allowPartial is enabled and we fail in the middle of a row, then it is not
always safe to open a new scanner. We need to do something on the server side. If the RS
receives an open-scanner request that carries an mvcc read point, it needs to check whether
the read point is greater than or equal to the current smallest read point, or whether we are
in the 'no major compaction period' introduced above. If neither holds, it needs to tell the
client that atomicity cannot be guaranteed and that the client needs to give up.

For {{Region}}, the same check also applies even if allowPartial is disabled, because we need
cross-row atomicity.
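
The server-side check described for {{Row}} and {{Region}} could be sketched roughly as follows. This is only an illustration of the condition, not HBase code: the class name, the method name, the result enum and the {{inNoMajorCompactionPeriod}} flag are all hypothetical, and how the RS would actually know it is inside the 'no major compaction period' is left open.

```java
// Hypothetical sketch of the read point check on open-scanner; not an HBase API.
final class ScannerOpenCheck {
    enum Result { OK, ATOMICITY_NOT_GUARANTEED }

    /**
     * @param requestedReadPoint        mvcc read point carried by the open-scanner request
     * @param smallestReadPoint         current smallest read point tracked for the region
     * @param inNoMajorCompactionPeriod assumed flag: true while major compaction is still held off
     */
    static Result check(long requestedReadPoint,
                        long smallestReadPoint,
                        boolean inNoMajorCompactionPeriod) {
        // Safe if no compaction can have applied delete markers newer than the
        // requested read point, or if major compaction has not been allowed yet.
        if (requestedReadPoint >= smallestReadPoint || inNoMajorCompactionPeriod) {
            return Result.OK;
        }
        // Otherwise the client has to be told to give up.
        return Result.ATOMICITY_NOT_GUARANTEED;
    }
}
```

For example, under these assumptions {{ScannerOpenCheck.check(5, 20, false)}} would reject the request, while the same request inside the 'no major compaction period' would be accepted.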

I think {{None}} here is the same thing as 'stateless' in HBASE-15576.
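
To make the three levels concrete, here is one possible reading of how the client could decide whether a scanner restart depends on the server-side read point check. The enum and method names are made up for this sketch, the timeout case is deliberately left out, and treating {{Region}} as always requiring the check is my interpretation of "cross row atomicity", not something confirmed in this issue.

```java
// Hypothetical client-side policy; names are illustrative, not an HBase API.
final class ScanRecovery {
    enum Atomicity { NONE, ROW, REGION }

    /**
     * Returns true if reopening a scanner is only safe when the server-side
     * read point check passes; false if the client can simply reopen.
     */
    static boolean needsServerCheck(Atomicity atomicity,
                                    boolean allowPartial,
                                    boolean failedMidRow) {
        switch (atomicity) {
            case NONE:
                // No guarantee requested: reopening a region scanner is enough.
                return false;
            case ROW:
                // Only a row that was partially returned can be torn.
                return allowPartial && failedMidRow;
            case REGION:
                // Cross-row atomicity: applies even when allowPartial is disabled.
                return true;
            default:
                throw new IllegalArgumentException("unknown atomicity: " + atomicity);
        }
    }
}
```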


> Major compaction can break the region/row level atomic when scan even if we pass mvcc
to client
> -----------------------------------------------------------------------------------------------
>                 Key: HBASE-17177
>                 URL: https://issues.apache.org/jira/browse/HBASE-17177
>             Project: HBase
>          Issue Type: Sub-task
>          Components: scan
>            Reporter: Duo Zhang
>             Fix For: 2.0.0, 1.4.0
> We know that major compaction will actually delete the cells which are deleted by a delete
marker. In order to give a consistent view for a scan, we need to use a map to track the read
points for all scanners for a region, and the smallest one will be used for a compaction.
For all delete markers whose mvcc is greater than this value, we will not use them to delete
other cells.
> And the problem for a scan restart after region move is that the new RS does not have
the information of the scanners opened at the old RS before the client sends scan requests
to the new RS, which means the read points map is incomplete and the smallest read point may be
greater than the correct value. So if a major compaction happens at that time, it may delete
some cells which should be kept.
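
The read point tracking in the description above can be modelled with a small sketch. It is a simplified stand-in, not the actual region implementation: the class, its methods and the use of a plain scanner id as the map key are assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified, hypothetical model of per-region read point tracking.
final class ReadPointTracker {
    // scanner id -> mvcc read point of that scanner
    private final Map<Long, Long> scannerReadPoints = new ConcurrentHashMap<>();

    void open(long scannerId, long readPoint) {
        scannerReadPoints.put(scannerId, readPoint);
    }

    void close(long scannerId) {
        scannerReadPoints.remove(scannerId);
    }

    /** Smallest read point of all open scanners, or currentReadPoint if none are open. */
    long smallestReadPoint(long currentReadPoint) {
        long min = currentReadPoint;
        for (long readPoint : scannerReadPoints.values()) {
            min = Math.min(min, readPoint);
        }
        return min;
    }

    /** A compaction only applies delete markers whose mvcc is not greater than the smallest read point. */
    static boolean canApplyDeleteMarker(long markerMvcc, long smallestReadPoint) {
        return markerMvcc <= smallestReadPoint;
    }
}
```

With a scanner opened at read point 5 on the old RS and a current read point of 20, a delete marker with mvcc 10 is not applied there. A fresh tracker on the new RS has an empty map, so the smallest read point is 20 and the same marker would be applied, which is the loss described above.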

This message was sent by Atlassian JIRA
