hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17407) Correct update of maxFlushedSeqId in HRegion
Date Wed, 04 Jan 2017 18:08:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15798918#comment-15798918 ]

stack commented on HBASE-17407:
-------------------------------

Can I help?

As per [~Apache9], onlyIfGreater seems odd.

Some background, if it helps:

The map on the regionserver of region+family to lowest sequence id is used in two places:
locally, for figuring out when it is safe to let go of a regionserver's WALs, and remotely,
since on every heartbeat to the master we send it our in-memory map. The master then keeps
a running account of the oldest sequenceid/edit in memstore for each family of each region.
It comes in handy at crash time when replaying WALs. If we see an edit for a family whose
sequenceid is less than what we have in our in-memory map for that region+family, then we
skip replaying the edit, since we know it is already persisted (out in an hfile). If the lowest
sequenceid goes backwards, that's not the end of the world; we will over-replay. If the
sequenceid accounting advances when it should not, we'll skip the replay of edits that should
have been replayed (data loss).
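
To make the asymmetry concrete, here is a minimal sketch of the skip-on-replay check described
above. It is not the HBase implementation; the class and method names (ReplayFilterSketch,
report, shouldReplay) are made up for illustration, and the map is a plain stand-in for the
per region+family accounting.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from HBase.
class ReplayFilterSketch {
    // Stand-in for the accounting: region+family -> lowest sequence id still in memstore.
    private final Map<String, Long> lowestUnflushedSeqId = new HashMap<>();

    // Record the value a regionserver would report for one region+family.
    public void report(String region, String family, long seqId) {
        lowestUnflushedSeqId.put(region + "/" + family, seqId);
    }

    // An edit is skipped only when its sequence id is below the recorded lowest
    // unflushed id, meaning it is assumed to be persisted in an hfile already.
    public boolean shouldReplay(String region, String family, long editSeqId) {
        Long lowest = lowestUnflushedSeqId.get(region + "/" + family);
        if (lowest == null) {
            return true; // nothing recorded: replay to be safe
        }
        return editSeqId >= lowest;
    }
}
```

With 100 recorded for a region+family, an edit with sequence id 99 is skipped and 100 is
replayed. If the recorded value wrongly drops to 50, edits 50..99 get replayed again, which is
harmless. If it wrongly jumps to 150, edits 100..149 are skipped even though they were never
flushed, which is the data loss case.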

Our [~Apache9] knows this area of the code because he added the finesse of backfilling accounting
by column family (where previously we only did accounting at the coarser region granularity).
Finding bugs in the implementation is awkward given the number of moving pieces and the offset
between writing and recovery.

> Correct update of maxFlushedSeqId in HRegion
> --------------------------------------------
>
>                 Key: HBASE-17407
>                 URL: https://issues.apache.org/jira/browse/HBASE-17407
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Eshcar Hillel
>
> The attribute maxFlushedSeqId in HRegion is used to track the max sequence id in the
> store files and is reported to HMaster. When flushing only part of the memstore content this
> value might be incorrect and may cause data loss.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
