hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16721) Concurrency issue in WAL unflushed seqId tracking
Date Wed, 28 Sep 2016 01:04:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15527972#comment-15527972 ]

Enis Soztutar commented on HBASE-16721:
---------------------------------------

In branch-1 and earlier, FSHLog.findRegionsToForceFlush() is the method that finds the regions
to be force-flushed. This means that the region is somehow still in oldestUnflushedStoreSequenceIds,
although it has already been closed and flushed.

I suspect the issue is a race between FSHLog.startCacheFlush() and FSHLog.append() over
the updates to the oldestUnflushedStoreSequenceIdsOfRegion map.

FSHLog.append() is called from the ring buffer handler thread, and does this:
{code}
    private void updateOldestUnflushedSequenceIds(byte[] encodedRegionName,
        Set<byte[]> familyNameSet, Long lRegionSequenceId) {
      ConcurrentMap<byte[], Long> oldestUnflushedStoreSequenceIdsOfRegion =
          getOrCreateOldestUnflushedStoreSequenceIdsOfRegion(encodedRegionName);
      for (byte[] familyName : familyNameSet) {
        oldestUnflushedStoreSequenceIdsOfRegion.putIfAbsent(familyName, lRegionSequenceId);
      }
    }
{code}

And FSHLog.startCacheFlush() does this: 
{code}
synchronized (regionSequenceIdLock) {
      ConcurrentMap<byte[], Long> oldestUnflushedStoreSequenceIdsOfRegion =
          oldestUnflushedStoreSequenceIds.get(encodedRegionName);
      if (oldestUnflushedStoreSequenceIdsOfRegion != null) {
        for (byte[] familyName: flushedFamilyNames) {
          Long seqId = oldestUnflushedStoreSequenceIdsOfRegion.remove(familyName);
          if (seqId != null) {
            oldStoreSeqNum.put(familyName, seqId);
          }
        }
        if (!oldStoreSeqNum.isEmpty()) {
          Map<byte[], Long> oldValue = this.lowestFlushingStoreSequenceIds.put(
              encodedRegionName, oldStoreSeqNum);
          assert oldValue == null: "Flushing map not cleaned up for "
              + Bytes.toString(encodedRegionName);
        }
        if (oldestUnflushedStoreSequenceIdsOfRegion.isEmpty()) {
          // Remove it otherwise it will be in oldestUnflushedStoreSequenceIds for ever
          // even if the region is already moved to other server.
          // Do not worry about data racing, we held write lock of region when calling
          // startCacheFlush, so no one can add value to the map we removed.
          oldestUnflushedStoreSequenceIds.remove(encodedRegionName);
        } else {
          oldestUnflushedSequenceId =
              Collections.min(oldestUnflushedStoreSequenceIdsOfRegion.values());
        }
      }
    }
{code}

The actual problem is that the sub-map for the region, which keeps the per-store unflushed
edits, has its entries removed in:
{code}
          Long seqId = oldestUnflushedStoreSequenceIdsOfRegion.remove(familyName);
{code}
but the actual region entry is removed some time later in: 
{code}
        if (oldestUnflushedStoreSequenceIdsOfRegion.isEmpty()) {
          oldestUnflushedStoreSequenceIds.remove(encodedRegionName);
{code}

However, this comment is not correct: 
{code}
          // Do not worry about data racing, we held write lock of region when calling
          // startCacheFlush, so no one can add value to the map we removed.
{code}
and it is likely that an append() request is still executing and will re-add the removed family
to the map. Although startCacheFlush() happens under the HRegion.updatesLock().writeLock()
and doMiniBatchMutation() acquires the read lock, we may give up the read lock before the
ring buffer handler actually performs the FSHLog.append(): we release the region lock in step
6, but only wait for the seqId assignment in step 8.
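As a minimal model of this window (hypothetical class, method, and key names; a deterministic replay of the suspected interleaving, not the actual HBase code), a straggling append() that lands after startCacheFlush() has fully cleaned up the region entry re-creates it with an already-flushed seqId:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative stand-in for the two-level FSHLog tracking map.
public class StaleSeqIdDemo {
  // encodedRegionName -> (familyName -> oldest unflushed seqId)
  static final ConcurrentMap<String, ConcurrentMap<String, Long>> oldestUnflushedStoreSequenceIds =
      new ConcurrentHashMap<>();

  // Models getOrCreateOldestUnflushedStoreSequenceIdsOfRegion(): recreates the sub-map if absent.
  static ConcurrentMap<String, Long> getOrCreate(String region) {
    return oldestUnflushedStoreSequenceIds.computeIfAbsent(region, r -> new ConcurrentHashMap<>());
  }

  // Models the append() path: putIfAbsent into the (possibly recreated) sub-map.
  static void append(String region, String family, long seqId) {
    getOrCreate(region).putIfAbsent(family, seqId);
  }

  // Models the removal half of startCacheFlush().
  static void startCacheFlush(String region, String family) {
    ConcurrentMap<String, Long> sub = oldestUnflushedStoreSequenceIds.get(region);
    if (sub != null) {
      sub.remove(family);
      if (sub.isEmpty()) {
        oldestUnflushedStoreSequenceIds.remove(region);
      }
    }
  }

  // Deterministic replay of the suspected interleaving.
  static Map<String, ConcurrentMap<String, Long>> reproduce() {
    append("d4cf39dc", "f", 10L);        // edit appended before the flush
    startCacheFlush("d4cf39dc", "f");    // flush removes the family, then the region entry
    append("d4cf39dc", "f", 10L);        // straggler append for the already-flushed edit
    return oldestUnflushedStoreSequenceIds; // region entry is back, holding a flushed seqId
  }

  public static void main(String[] args) {
    System.out.println(reproduce()); // prints {d4cf39dc={f=10}}
  }
}
```

Once the region entry is re-created this way, nothing ever removes it again, so findRegionsToForceFlush() keeps nominating a region that no longer exists on the server.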


> Concurrency issue in WAL unflushed seqId tracking
> -------------------------------------------------
>
>                 Key: HBASE-16721
>                 URL: https://issues.apache.org/jira/browse/HBASE-16721
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 1.2.4
>
>
> I'm inspecting an interesting case where, in a production cluster, some regionservers
end up accumulating hundreds of WAL files, even with force flushes going on due to max logs.
This happened multiple times on the cluster, but not on other clusters. The cluster has the periodic
memstore flusher disabled; however, this still does not explain why the force flush of regions
due to the max-logs limit is not working. I think the periodic memstore flusher just masks the underlying
problem, which is why we do not see this in other clusters.
> The problem starts like this: 
> {code}
> 2016-09-21 17:49:18,272 INFO  [regionserver//10.2.0.55:16020.logRoller] wal.FSHLog: Too
many wals: logs=33, maxlogs=32; forcing flush of 1 regions(s): d4cf39dc40ea79f5da4d0cf66d03cb1f
> 2016-09-21 17:49:18,273 WARN  [regionserver//10.2.0.55:16020.logRoller] regionserver.LogRoller:
Failed to schedule flush of d4cf39dc40ea79f5da4d0cf66d03cb1f, region=null, requester=null
> {code}
> then, it continues until the RS is restarted: 
> {code}
> 2016-09-23 17:43:49,356 INFO  [regionserver//10.2.0.55:16020.logRoller] wal.FSHLog: Too
many wals: logs=721, maxlogs=32; forcing flush of 1 regions(s): d4cf39dc40ea79f5da4d0cf66d03cb1f
> 2016-09-23 17:43:49,357 WARN  [regionserver//10.2.0.55:16020.logRoller] regionserver.LogRoller:
Failed to schedule flush of d4cf39dc40ea79f5da4d0cf66d03cb1f, region=null, requester=null
> {code}
> The problem is that region {{d4cf39dc40ea79f5da4d0cf66d03cb1f}} was already split some
time ago, and was able to flush its data and split without any problems. However, the FSHLog
still thinks that there is some unflushed data for this region.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
