hbase-issues mailing list archives

From "Eshcar Hillel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17407) Correct update of maxFlushedSeqId in HRegion
Date Tue, 03 Jan 2017 09:29:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15794617#comment-15794617 ]

Eshcar Hillel commented on HBASE-17407:
---------------------------------------

In HRegion::internalFlushCacheAndCommit() we have the following piece of code:
{code}

    // If we get to here, the HStores have been written.
    for (Store storeToFlush : storesToFlush) {
      ((HStore) storeToFlush).finalizeFlush();
    }
    if (wal != null) {
      wal.completeCacheFlush(this.getRegionInfo().getEncodedNameAsBytes());
    }

    // Record latest flush time
    for (Store store : storesToFlush) {
      this.lastStoreFlushTimeMap.put(store, startTime);
    }

    this.maxFlushedSeqId = flushedSeqId;
    this.lastFlushOpSeqId = flushOpSeqId;
{code}

The method finalizeFlush() was added together with the compacting memstore, precisely so that
the WAL is updated with the correct sequence number before the flush operation completes.

Indeed, as the code shows, maxFlushedSeqId is set from the value passed to the method
(flushedSeqId in the snippet above) and is not recomputed from the current state of the stores.

There are two options to fix this:
(1) See whether we can pass the correct sequence id to the method (maybe we already do?).
(2) Build upon the finalizeFlush() method, which finds the lowest unflushed sequence id per store.
It can return the *lowest unflushed seq id* per store; we then take the minimum across the
stores and derive the *maximal flushed seq id* from it.
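
Option (2) could be sketched roughly as follows. This is only an illustration under stated assumptions, not HBase code: the class MaxFlushedSeqIdSketch, the method deriveMaxFlushedSeqId(), the NO_UNFLUSHED_EDITS marker and the fallback parameter are all hypothetical names, and the sketch assumes that finalizeFlush() (or a similar call) can hand back the lowest unflushed sequence id of each store.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical, standalone sketch; none of these names exist in HBase.
final class MaxFlushedSeqIdSketch {

  // Assumed marker for a store whose memstore holds no unflushed edits.
  static final long NO_UNFLUSHED_EDITS = -1L;

  /**
   * Derives the maximal flushed sequence id from per-store values.
   *
   * @param lowestUnflushedPerStore lowest unflushed seq id of each store,
   *        or NO_UNFLUSHED_EDITS if the store was flushed completely
   * @param fallbackFlushedSeqId value to use when no store has unflushed
   *        edits (assumed here to be the flushedSeqId used today)
   */
  static long deriveMaxFlushedSeqId(List<Long> lowestUnflushedPerStore,
                                    long fallbackFlushedSeqId) {
    // Minimum over the stores that still hold unflushed edits.
    OptionalLong minUnflushed = lowestUnflushedPerStore.stream()
        .mapToLong(Long::longValue)
        .filter(id -> id != NO_UNFLUSHED_EDITS)
        .min();
    // Everything strictly below the lowest unflushed id has been flushed.
    return minUnflushed.isPresent()
        ? minUnflushed.getAsLong() - 1
        : fallbackFlushedSeqId;
  }

  public static void main(String[] args) {
    // Three stores: two flushed only partially, one flushed completely.
    List<Long> perStore = Arrays.asList(42L, 17L, NO_UNFLUSHED_EDITS);
    System.out.println(deriveMaxFlushedSeqId(perStore, 100L)); // prints 16
  }
}
```

In this sketch the store whose memstore still holds the edit with sequence id 17 bounds the result, so the region would report 16 rather than the flush-time value.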

> Correct update of maxFlushedSeqId in HRegion
> --------------------------------------------
>
>                 Key: HBASE-17407
>                 URL: https://issues.apache.org/jira/browse/HBASE-17407
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Eshcar Hillel
>
> The attribute maxFlushedSeqId in HRegion is used to track the max sequence id in the
> store files and is reported to HMaster. When flushing only part of the memstore content, this
> value might be incorrect and may cause data loss.



