hbase-issues mailing list archives

From "Yu Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16994) Region report a last flushed sequence id that is less than the previous last flushed sequence id
Date Wed, 02 Nov 2016 14:33:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629141#comment-15629141 ]

Yu Li commented on HBASE-16994:

I came across HBASE-16721 while checking the branch-1 commit history of {{HRegion.java}}, and I
think it's a similar issue (but not the same one). I think we could borrow the approach from
the branch-1 code, like below:
    MultiVersionConcurrencyControl.WriteEntry writeEntry = mvcc.begin();
    // wait for all in-progress transactions to commit to WAL before
    // we can start the flush. This prevents
    // uncommitted transactions from being written into HFiles.
    // We have to block before we start the flush, otherwise keys that
    // were removed via a rollbackMemstore could be written to Hfiles.
    // set writeEntry to null to prevent mvcc.complete from being called again inside finally
    // block
    writeEntry = null;
before {{startCacheFlush}}. I think it's safer than clearing {{oldestUnflushedStoreSequenceIds}}.
Does this approach address your concern, [~Apache9]?
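
To illustrate the "block before the flush starts" idea: the snippet above shows the {{mvcc.begin()}} call and the comments, but not the call that actually waits. The sketch below models that step on a toy class. Everything in it ({{ToyMvcc}}, {{completeAndWait}}, the read/write point bookkeeping, the 50 ms delay) is hypothetical and only meant to show the ordering guarantee; it is not HBase's {{MultiVersionConcurrencyControl}}.

```java
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy stand-in for an MVCC write queue; all names here are hypothetical. */
final class ToyMvcc {
    private long writePoint = 0;                            // last entry number handed out
    private long readPoint = 0;                             // every entry <= readPoint is complete
    private final TreeSet<Long> pending = new TreeSet<>();  // entries still in flight

    synchronized long begin() {
        long entry = ++writePoint;
        pending.add(entry);
        return entry;
    }

    synchronized void complete(long entry) {
        pending.remove(entry);
        // The read point advances up to just before the oldest in-flight entry.
        readPoint = pending.isEmpty() ? writePoint : pending.first() - 1;
        notifyAll();
    }

    /** Completes our own entry, then blocks until every earlier entry is complete too. */
    synchronized void completeAndWait(long entry) {
        complete(entry);
        boolean interrupted = false;
        while (readPoint < entry) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true;
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns true if the flush side observed the in-flight append as committed. */
    static boolean demo() {
        ToyMvcc mvcc = new ToyMvcc();
        AtomicBoolean appendCommitted = new AtomicBoolean(false);
        long appendEntry = mvcc.begin();          // an append is already in flight
        Thread appender = new Thread(() -> {
            try {
                Thread.sleep(50);                 // simulate asynchronous handling
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            appendCommitted.set(true);
            mvcc.complete(appendEntry);
        });
        appender.start();

        long flushEntry = mvcc.begin();           // the flush takes its own entry
        mvcc.completeAndWait(flushEntry);         // blocks until the append has completed
        boolean seen = appendCommitted.get();     // checked where the flush would start
        try {
            appender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("append committed before flush start: " + demo());
    }
}
```

In this toy model the flush cannot proceed while an earlier append is still pending, which is the property the branch-1 comments describe.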

[~stack] and [~enis], would you mind taking a look here? It's pretty much like HBASE-16721, but
for a case we neglected to address on the master branch. Thanks.

> Region report a last flushed sequence id that is less than the previous last flushed sequence id
> -------------------------------------------------------------------------------------------------
>                 Key: HBASE-16994
>                 URL: https://issues.apache.org/jira/browse/HBASE-16994
>             Project: HBase
>          Issue Type: Bug
>            Reporter: binlijin
>         Attachments: HBASE-16994_master_v1.patch, HBASE-16994_master_v2.patch
> Since an append is published to the RingBuffer and handled asynchronously, it's possible
> that one append (say append-X) of the region is handled by RingBufferEventHandler between
> startCacheFlush and getNextSequenceId, and resets the FSHLog#oldestUnflushedStoreSequenceIds
> entry which we just cleared in #startCacheFlush. This might disturb
> ServerManager#flushedSequenceIdByRegion as shown below (assume region-A has two CFs: cfA and cfB):
> 1. flush-A runs to startCacheFlush and will flush both cfA and cfB; oldestUnflushedStoreSequenceIds
> of regionA gets cleared
> 2. append-X on cfB is handled by RingBufferEventHandler; oldestUnflushedStoreSequenceIds
> is set to 10, for example
> 3. flush-A runs to getNextSequenceId, which returns 11
> 4. ServerManager#flushedSequenceIdByRegion for regionA is set to 11
> 5. flush-A finishes
> 6. flush-B starts and only flushes cfA; getNextSequenceId returned 10, and flushedSeqId
> will return 9, causing a warning in ServerManager
> Since this append-X will also get flushed, we should clear the oldestUnflushedStoreSequenceIds
> again to make sure we won't disturb ServerManager#flushedSequenceIdByRegion.
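
The six-step interleaving in the quoted description can be replayed in a small single-threaded toy model. This is only a sketch under stated assumptions, not HBase code: the class and method names are made up, the starting sequence ids are invented so that append-X gets 10, and it assumes the flushed id is derived as "oldest remaining unflushed id minus one, or the flush's own sequence id when nothing remains". It also reads the 10 in step 6 as the oldest unflushed id left behind by append-X, which is an interpretation of the description.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy replay of the flush/append interleaving; all names are hypothetical. */
final class FlushRaceSketch {
    static final long NO_SEQNUM = -1;

    /** Clears the flushed families and returns the oldest id still unflushed, or NO_SEQNUM. */
    static long startCacheFlush(Map<String, Long> oldestUnflushed, String... families) {
        for (String family : families) {
            oldestUnflushed.remove(family);
        }
        long min = NO_SEQNUM;
        for (long seqId : oldestUnflushed.values()) {
            if (min == NO_SEQNUM || seqId < min) {
                min = seqId;
            }
        }
        return min;
    }

    /** Assumed rule: report (oldest unflushed - 1), or the flush's own id if nothing is left. */
    static long flushedSeqId(long earliestUnflushed, long flushOpSeqId) {
        return earliestUnflushed == NO_SEQNUM ? flushOpSeqId : earliestUnflushed - 1;
    }

    /** Returns {id reported by flush-A, id reported by flush-B}. */
    static long[] simulate(boolean clearAgain) {
        Map<String, Long> oldestUnflushed = new HashMap<>();
        oldestUnflushed.put("cfA", 5L);   // made-up earlier edits
        oldestUnflushed.put("cfB", 6L);
        long nextSeqId = 10;              // chosen so that append-X gets 10

        // Step 1: flush-A covers both families, so the map is cleared.
        long earliestA = startCacheFlush(oldestUnflushed, "cfA", "cfB");
        // Step 2: append-X on cfB is handled and re-populates the map with 10.
        oldestUnflushed.putIfAbsent("cfB", nextSeqId++);
        // Step 3: flush-A obtains its sequence id (11).
        long flushOpA = nextSeqId++;
        if (clearAgain) {
            // Proposed remedy from the description: clear the entries again.
            startCacheFlush(oldestUnflushed, "cfA", "cfB");
        }
        // Steps 4 and 5: flush-A reports its flushed id and finishes.
        long reportedA = flushedSeqId(earliestA, flushOpA);
        // Step 6: flush-B covers only cfA.
        long earliestB = startCacheFlush(oldestUnflushed, "cfA");
        long flushOpB = nextSeqId++;
        long reportedB = flushedSeqId(earliestB, flushOpB);
        return new long[] {reportedA, reportedB};
    }

    public static void main(String[] args) {
        long[] racy = simulate(false);
        long[] cleared = simulate(true);
        System.out.println("without second clear: " + racy[0] + " then " + racy[1]);
        System.out.println("with second clear:    " + cleared[0] + " then " + cleared[1]);
    }
}
```

Under these assumptions the run without the second clear reports 11 and then 9, which is the regression that would trigger the ServerManager warning; with the second clear the second report does not go backwards.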
