hbase-issues mailing list archives

From "Yu Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16994) Region report a last flushed sequence id that is less than the previous last flushed sequence id
Date Wed, 02 Nov 2016 14:01:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629060#comment-15629060
] 

Yu Li commented on HBASE-16994:
-------------------------------

Some background: we have observed many warnings like the one below in the HMaster log in our
production environment, and the issue description is what we found after investigation.
{noformat}
2016-09-07 21:17:09,559 WARN  [PriorityRpcServer.handler=14,queue=0,port=60100] master.ServerManager: RegionServer hadoop0676.et2.tbsite.net,16020,1472107731858 indicates a last flushed sequence id (26683793) that is less than the previous last flushed sequence id (26683796) for region main_result_b,1879,1465227739374.a5b18fc39144b7333dec8bad22d56f11. Ignoring.
{noformat}
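
The warning comes from a monotonicity check on the master side: a reported last flushed sequence id that is lower than the one already recorded for the region is ignored. A minimal sketch of that kind of check follows. The class `FlushedSeqIdTracker` and its methods are hypothetical names for illustration and are not HBase's ServerManager code; only the map name `flushedSequenceIdByRegion` is taken from the issue description.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical illustration only; not the actual ServerManager implementation.
class FlushedSeqIdTracker {
    // Last accepted flushed sequence id per region (name borrowed from the issue text).
    private final ConcurrentMap<String, Long> flushedSequenceIdByRegion = new ConcurrentHashMap<>();

    // Returns true if the report was recorded, false if it was ignored as going backwards.
    boolean report(String regionName, long lastFlushedSeqId) {
        Long previous = flushedSequenceIdByRegion.get(regionName);
        if (previous != null && lastFlushedSeqId < previous) {
            System.err.printf(
                "last flushed sequence id (%d) is less than the previous one (%d) for region %s. Ignoring.%n",
                lastFlushedSeqId, previous, regionName);
            return false;
        }
        // Note: get-then-put is not atomic; acceptable for a single-threaded sketch.
        flushedSequenceIdByRegion.put(regionName, lastFlushedSeqId);
        return true;
    }

    // Returns the recorded id for the region, or -1 if nothing was reported yet.
    long current(String regionName) {
        Long value = flushedSequenceIdByRegion.get(regionName);
        return value == null ? -1L : value;
    }
}
```

With the values from the log line, reporting 26683796 and then 26683793 for the same region leaves the recorded value at 26683796.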


> Region report a last flushed sequence id that is less than the previous last flushed sequence id
> -------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16994
>                 URL: https://issues.apache.org/jira/browse/HBASE-16994
>             Project: HBase
>          Issue Type: Bug
>            Reporter: binlijin
>         Attachments: HBASE-16994_master_v1.patch, HBASE-16994_master_v2.patch
>
>
> Since appends are published to the RingBuffer and handled asynchronously, it is possible
> that one append (say append-X) for the region is handled by RingBufferEventHandler between
> startCacheFlush and getNextSequenceId, and resets the FSHLog#oldestUnflushedStoreSequenceIds
> entry that we just cleared in #startCacheFlush. This can disturb
> ServerManager#flushedSequenceIdByRegion as shown below (assume region-A has two column
> families, cfA and cfB):
>    
> 1. flush-A runs to startCacheFlush and will flush both cfA and cfB; oldestUnflushedStoreSequenceIds
> for region-A is cleared
> 2. append-X on cfB is handled by RingBufferEventHandler; oldestUnflushedStoreSequenceIds
> is set to 10, for example
> 3. flush-A runs to getNextSequenceId, which returns 11
> 4. ServerManager#flushedSequenceIdByRegion for region-A is set to 11
> 5. flush-A finishes
> 6. flush-B starts and flushes only cfA; getNextSequenceId returns 10, so flushedSeqId
> is 9, which triggers the warning in ServerManager
>
> Since append-X is also flushed by flush-A, we should clear oldestUnflushedStoreSequenceIds
> again to make sure we do not disturb ServerManager#flushedSequenceIdByRegion.
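
The six steps above can be replayed in a single-threaded toy model. This is only a sketch under simplifying assumptions, not the FSHLog or HRegion code: the class `FlushRaceSketch`, its fields, and the rule "report flushOpSeqId when nothing is left unflushed, otherwise the lowest unflushed id minus one" are illustrative stand-ins, and the asynchronous append is modeled as an ordinary call placed between the two flush steps.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of one region's WAL bookkeeping; hypothetical, not HBase code.
class FlushRaceSketch {
    static final long NO_SEQNUM = -1L;

    // Stand-in for FSHLog#oldestUnflushedStoreSequenceIds: family -> oldest unflushed id.
    final Map<String, Long> oldestUnflushed = new HashMap<>();
    long nextSeqId;

    FlushRaceSketch(long firstSeqId) { this.nextSeqId = firstSeqId; }

    // Clears the families being flushed; returns the lowest id still tracked, or NO_SEQNUM.
    long startCacheFlush(List<String> families) {
        for (String family : families) {
            oldestUnflushed.remove(family);
        }
        long lowest = NO_SEQNUM;
        for (long id : oldestUnflushed.values()) {
            if (lowest == NO_SEQNUM || id < lowest) lowest = id;
        }
        return lowest;
    }

    // Models the event handler processing an append: records the id if none is tracked yet.
    void handleAppend(String family) {
        oldestUnflushed.putIfAbsent(family, nextSeqId++);
    }

    long getNextSequenceId() { return nextSeqId++; }

    static long flushedSeqId(long earliestUnflushed, long flushOpSeqId) {
        return earliestUnflushed == NO_SEQNUM ? flushOpSeqId : earliestUnflushed - 1;
    }

    // Returns {id reported by flush-A, id reported by flush-B}.
    static long[] simulate(boolean clearAgain) {
        FlushRaceSketch wal = new FlushRaceSketch(10);
        List<String> both = Arrays.asList("cfA", "cfB");

        long earliestA = wal.startCacheFlush(both);        // step 1: entries cleared
        wal.handleAppend("cfB");                           // step 2: cfB -> 10
        long flushOpA = wal.getNextSequenceId();           // step 3: returns 11
        if (clearAgain) wal.startCacheFlush(both);         // proposed fix: clear once more
        long reportedA = flushedSeqId(earliestA, flushOpA); // step 4: 11

        // step 6: flush-B flushes only cfA; a stale cfB entry drags the result down.
        long earliestB = wal.startCacheFlush(Arrays.asList("cfA"));
        long flushOpB = wal.getNextSequenceId();
        long reportedB = flushedSeqId(earliestB, flushOpB);
        return new long[] {reportedA, reportedB};
    }

    public static void main(String[] args) {
        System.out.println("without second clear: " + Arrays.toString(simulate(false)));
        System.out.println("with second clear:    " + Arrays.toString(simulate(true)));
    }
}
```

In this model, the run without the second clear reports 11 and then 9, reproducing the regression that the master warns about. Clearing the entry again after getNextSequenceId keeps the second report above the first.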



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
