hbase-dev mailing list archives

From "Jeffrey Zhong (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-11099) Two situations where we could open a region with smaller sequence number
Date Wed, 30 Apr 2014 01:46:17 GMT
Jeffrey Zhong created HBASE-11099:

             Summary: Two situations where we could open a region with smaller sequence number
                 Key: HBASE-11099
                 URL: https://issues.apache.org/jira/browse/HBASE-11099
             Project: HBase
          Issue Type: Bug
          Components: regionserver
    Affects Versions: 0.99.0
            Reporter: Jeffrey Zhong

Recently I ran into two code paths where we could potentially open a region with a smaller
sequence number:

1) Inside HRegion#internalFlushcache. This follows from the change to WAL sync, which now uses
late binding (the sequence number is assigned right before the WAL sync).
The flushSeqId may be less than the sequence number of a change included in the flush, which may
cause the later region-opening code to use a smaller-than-expected sequence number when we reopen
the region:
          flushSeqId = this.sequenceId.incrementAndGet();
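
To make the ordering problem in 1) concrete, here is a minimal toy model of the race. It is only a sketch under stated assumptions: LateBindingFlushSketch, PendingEdit and simulateRace are hypothetical names invented for illustration, not HBase classes, and the starting counter value of 10 is arbitrary. It shows an edit that is already in the memstore snapshot receiving its sequence number after the flush has taken its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Toy model only: none of these names are real HBase classes.
class LateBindingFlushSketch {
    // Stand-in for the region-wide sequence counter.
    static final AtomicLong sequenceId = new AtomicLong(0);

    // An edit that is already in the memstore but has no sequence number yet.
    static final class PendingEdit {
        long seqId = -1; // bound right before the WAL sync (late binding)
    }

    // Returns {flushSeqId, largest sequence number among the flushed edits}.
    static long[] simulateRace() {
        sequenceId.set(10); // arbitrary starting point for the illustration
        List<PendingEdit> memstoreSnapshot = new ArrayList<>();

        // 1. A write lands in the memstore; its sequence number is not bound yet.
        PendingEdit edit = new PendingEdit();
        memstoreSnapshot.add(edit);

        // 2. The flush picks its sequence number first.
        long flushSeqId = sequenceId.incrementAndGet();

        // 3. Only now does the WAL sync bind the edit's sequence number.
        edit.seqId = sequenceId.incrementAndGet();

        // 4. The snapshot being flushed still contains that edit.
        long maxFlushed = -1;
        for (PendingEdit e : memstoreSnapshot) {
            maxFlushed = Math.max(maxFlushed, e.seqId);
        }
        return new long[] {flushSeqId, maxFlushed};
    }

    public static void main(String[] args) {
        long[] r = simulateRace();
        System.out.println("flushSeqId=" + r[0] + " maxEditSeqIdInFlush=" + r[1]);
    }
}
```

In this toy run the flush records 11 while the flushed data contains an edit numbered 12, which is the mismatch described above.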

2) HRegion#replayRecoveredEdits, where we have the following code:
          if (coprocessorHost != null) {
            status.setStatus("Running pre-WAL-restore hook in coprocessors");
            if (coprocessorHost.preWALRestore(this.getRegionInfo(), key, val)) {
              // if bypass this log entry, ignore it ...
              continue;
            }
          }
          currentEditSeqId = key.getLogSeqNum();
If a coprocessor skips some tail WALEdits, the function returns a smaller currentEditSeqId.
In the end, a region may also open with a smaller sequence number. This may cause data loss,
because the Master may have recorded a larger flushed sequence id and some WALEdits may be skipped
during recovery if the region fails again.
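
The effect in 2) can be reproduced with a small stand-alone loop. This is a sketch, not the HRegion code: ReplaySkipSketch, replay and the recordBeforeHook flag are hypothetical names, and the bypass predicate merely stands in for the coprocessor hook. The flag shows one conceivable adjustment (recording the sequence number before the hook runs), offered only for comparison and not as the fix the project adopted.

```java
import java.util.function.LongPredicate;

// Toy model only: mimics the shape of the replay loop, not the real HRegion code.
class ReplaySkipSketch {
    // editSeqIds: sequence numbers of the edits in the recovered log, in order.
    // bypass: stand-in for the coprocessor hook returning true ("skip this edit").
    // recordBeforeHook: if true, track the sequence number before the hook runs.
    static long replay(long[] editSeqIds, LongPredicate bypass, boolean recordBeforeHook) {
        long currentEditSeqId = -1;
        for (long seqId : editSeqIds) {
            if (recordBeforeHook) {
                currentEditSeqId = seqId;
            }
            if (bypass.test(seqId)) {
                // Bypassed edits never reach the assignment below.
                continue;
            }
            currentEditSeqId = seqId;
        }
        return currentEditSeqId;
    }

    public static void main(String[] args) {
        long[] edits = {5, 6, 7, 8};
        LongPredicate skipTail = seqId -> seqId >= 7; // hook bypasses the last two edits
        System.out.println("as in the issue: " + replay(edits, skipTail, false));
        System.out.println("recorded before hook: " + replay(edits, skipTail, true));
    }
}
```

With the tail edits 7 and 8 bypassed, the first call returns 6 even though the log reached 8; the second call returns 8.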
