hbase-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16994) Region report a last flushed sequence id that is less than the previous last flushed sequence id
Date Wed, 02 Nov 2016 13:59:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629054#comment-15629054 ]

Duo Zhang commented on HBASE-16994:

Thanks for pointing this out. I think the steps to reproduce the bug are correct.

On the fix, I think we need to do the reset work after fencing the mvcc. Otherwise you cannot
be sure whether the RingBufferEventHandler has finished the sequence id accounting work. And
if we do not have such a fence during flush, then I think this is a very critical bug, because
we may lose data...
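
To illustrate the ordering concern, here is a minimal sketch of "fence first, then reset". It is not HBase code: the class FenceThenResetSketch and all of its members are hypothetical, a single-threaded executor stands in for the RingBufferEventHandler, and the "fence" is simply a marker task that the flush waits on (HBase's real MVCC fencing works differently). It only shows why the reset is safe once every previously published append has been accounted for.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical model, not HBase's API.
final class FenceThenResetSketch {
    // Stand-in for the per-store oldest unflushed sequence ids of one region.
    final ConcurrentHashMap<String, Long> oldestUnflushed = new ConcurrentHashMap<>();
    // Single consumer thread, standing in for the RingBufferEventHandler.
    private final ExecutorService handler = Executors.newSingleThreadExecutor();

    // Publishing returns immediately; the accounting happens later on the handler thread.
    void publishAppend(String store, long seqId) {
        handler.execute(() -> oldestUnflushed.putIfAbsent(store, seqId));
    }

    // Fence: tasks run in submission order, so once this marker has run,
    // every append published before it has been accounted for.
    void fence() {
        Runnable marker = () -> { };
        try {
            handler.submit(marker).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("fence did not complete", e);
        }
    }

    // Reset only after the fence, so a late handler update cannot undo the reset.
    void flushAllStores() {
        fence();
        oldestUnflushed.clear();
    }

    void shutdown() {
        handler.shutdown();
    }

    // Publishes a burst of appends, flushes, and returns how many stale entries remain.
    static int unflushedEntriesAfterFlush() {
        FenceThenResetSketch region = new FenceThenResetSketch();
        try {
            for (long seqId = 1; seqId <= 1000; seqId++) {
                region.publishAppend("cfB", seqId);
            }
            region.flushAllStores();
            return region.oldestUnflushed.size();
        } finally {
            region.shutdown();
        }
    }
}
```

In this model, clearing without the fence could race with appends still queued on the handler thread, which is the situation the comment above is worried about.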

> Region report a last flushed sequence id that is less than the previous last flushed sequence id
> -------------------------------------------------------------------------------------------------
>                 Key: HBASE-16994
>                 URL: https://issues.apache.org/jira/browse/HBASE-16994
>             Project: HBase
>          Issue Type: Bug
>            Reporter: binlijin
>         Attachments: HBASE-16994_master_v1.patch, HBASE-16994_master_v2.patch
> Since appends are published to the RingBuffer and handled asynchronously, it is possible that an append (say append-X) of the region is handled by RingBufferEventHandler between startCacheFlush and getNextSequenceId, and resets the FSHLog#oldestUnflushedStoreSequenceIds entry that we just cleared in #startCacheFlush. This can disturb ServerManager#flushedSequenceIdByRegion as shown below (assume region-A has two CFs, cfA and cfB):
>  1. flush-A runs to startCacheFlush and will flush both cfA and cfB; oldestUnflushedStoreSequenceIds of region-A gets cleared
>  2. append-X on cfB is handled by RingBufferEventHandler; oldestUnflushedStoreSequenceIds is set to 10, for example
>  3. flush-A runs to getNextSequenceId, which returns 11
>  4. ServerManager#flushedSequenceIdByRegion for region-A is set to 11
>  5. flush-A finishes
>  6. flush-B starts and flushes only cfA; getNextSequenceId returns 10, so flushedSeqId will be 9, which causes a warning in ServerManager
> Since append-X will also get flushed, we should clear oldestUnflushedStoreSequenceIds again to make sure we won't disturb ServerManager#flushedSequenceIdByRegion.
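
The six steps in the description can be replayed with a small single-threaded model. This is only a sketch under stated assumptions, not the FSHLog or ServerManager code: the class FlushSeqIdRaceSketch and its methods are hypothetical, and it assumes that a flush reports one less than the oldest unflushed sequence id among the stores it is not flushing, falling back to the flush's own sequence id when no such store has unflushed edits.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the scenario, not HBase's API.
final class FlushSeqIdRaceSketch {

    // Assumed rule: if a store that is NOT being flushed still has unflushed edits,
    // report (oldest such sequence id - 1); otherwise report the flush's own id.
    static long flushedSeqId(Map<String, Long> oldestUnflushed,
                             Set<String> storesBeingFlushed,
                             long flushOpSeqId) {
        long earliest = Long.MAX_VALUE;
        for (Map.Entry<String, Long> e : oldestUnflushed.entrySet()) {
            if (!storesBeingFlushed.contains(e.getKey())) {
                earliest = Math.min(earliest, e.getValue());
            }
        }
        return earliest == Long.MAX_VALUE ? flushOpSeqId : earliest - 1;
    }

    // Replays steps 1-6; returns {id reported by flush-A, id reported by flush-B}.
    static long[] replay(boolean clearAgainAfterSeqId) {
        Map<String, Long> oldestUnflushed = new HashMap<>();
        Set<String> bothStores = new HashSet<>(Arrays.asList("cfA", "cfB"));
        long nextSeqId = 10;

        // 1. flush-A reaches startCacheFlush for cfA and cfB: entries are cleared.
        oldestUnflushed.clear();
        // 2. append-X on cfB is handled late and re-creates the cfB entry (10).
        long appendX = nextSeqId++;
        oldestUnflushed.putIfAbsent("cfB", appendX);
        // 3. flush-A obtains its sequence id (11).
        long flushA = nextSeqId++;
        // Proposed fix: append-X is also flushed by flush-A, so clear again.
        if (clearAgainAfterSeqId) {
            oldestUnflushed.clear();
        }
        // 4./5. flush-A reports its flushed sequence id and finishes.
        long reportedByFlushA = flushedSeqId(oldestUnflushed, bothStores, flushA);
        // 6. flush-B flushes only cfA.
        long flushB = nextSeqId++;
        long reportedByFlushB =
            flushedSeqId(oldestUnflushed, Collections.singleton("cfA"), flushB);
        return new long[] {reportedByFlushA, reportedByFlushB};
    }

    public static void main(String[] args) {
        long[] racy = replay(false);
        long[] fixed = replay(true);
        System.out.println("without second clear: " + racy[0] + " then " + racy[1]);
        System.out.println("with second clear:    " + fixed[0] + " then " + fixed[1]);
    }
}
```

Under these assumptions the model reports 11 and then 9 without the second clear (the regression that triggers the warning), and 11 and then 12 with it.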

This message was sent by Atlassian JIRA
