hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
Date Thu, 25 Aug 2011 04:24:29 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090756#comment-13090756 ]

ramkrishna.s.vasudevan commented on HBASE-3845:
-----------------------------------------------

{code}
+      if (wal != null) 
+        wal.abortCacheFlush(this.regionInfo.getEncodedNameAsBytes());
{code}
Please use braces here, since the body of the if is on a second line.
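
For illustration only, the braced form could look like the sketch below. The BracesSketch class, its nested Wal interface, and the abortFlush helper are hypothetical stand-ins so that the snippet compiles on its own; only the abortCacheFlush(byte[]) call is taken from the patch hunk quoted above.

```java
/** Hypothetical wrapper, not HBase code: shows the braced form of the quoted hunk. */
class BracesSketch {

    /** Hypothetical stand-in for the WAL type; models only the call seen in the patch. */
    interface Wal {
        void abortCacheFlush(byte[] encodedRegionName);
    }

    /** Forwards the abort to the WAL if there is one; returns whether it was forwarded. */
    static boolean abortFlush(Wal wal, byte[] encodedRegionName) {
        // Braces make the extent of the if body explicit even for one statement.
        if (wal != null) {
            wal.abortCacheFlush(encodedRegionName);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Wal loggingWal = name -> System.out.println("abortCacheFlush called");
        System.out.println(abortFlush(loggingWal, new byte[] {1}));
        System.out.println(abortFlush(null, new byte[] {1}));
    }
}
```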


> data loss because lastSeqWritten can miss memstore edits
> --------------------------------------------------------
>
>                 Key: HBASE-3845
>                 URL: https://issues.apache.org/jira/browse/HBASE-3845
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.3
>            Reporter: Prakash Khemani
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.92.0
>
>         Attachments: 0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch,
HBASE-3845-fix-TestResettingCounters-test.txt, HBASE-3845_1.patch, HBASE-3845_2.patch, HBASE-3845_4.patch,
HBASE-3845_5.patch, HBASE-3845_6.patch, HBASE-3845__trunk.patch, HBASE-3845_branch90V1.patch,
HBASE-3845_trunk_2.patch, HBASE-3845_trunk_3.patch
>
>
> (I don't have a test case to prove this yet but I have run it by Dhruba and Kannan internally
and wanted to put this up for some feedback.)
> In this discussion let us assume that the region has only one column family. That way
I can use region/memstore interchangeably.
> After a memstore flush it is possible for lastSeqWritten to have a log-sequence-id for
a region that is not the earliest log-sequence-id for that region's memstore.
> HLog.append() does a putIfAbsent into lastSeqWritten. This is to ensure that we
only keep track of the earliest log-sequence-number that is present in the memstore.
> Every time the memstore is flushed we remove the region's entry in lastSeqWritten
and wait for the next append to populate this entry again. This is where the problem happens.
> step 1:
> flusher.prepare() snapshots the memstore under HRegion.updatesLock.writeLock().
> step 2:
> as soon as the updatesLock.writeLock() is released, new entries will be added into the
memstore.
> step 3:
> wal.completeCacheFlush() is called. This method removes the region's entry from lastSeqWritten.
> step 4:
> the next append will create a new entry for the region in lastSeqWritten. But this
will be the log seq id of the current append. All the edits that were added in step 2 are
not accounted for.
> ==
> As a temporary measure, instead of removing the region's entry in step 3, I will replace
it with the log-seq-id of the region-flush-event.
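
The four steps and the proposed temporary measure can be replayed in a small single-threaded model. This is a rough sketch, not HBase code: the class LastSeqWrittenSketch, its fields, and the two completeCacheFlush variants are hypothetical names, and it assumes the flush event's sequence id is taken at the moment the memstore is snapshotted in step 1.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical model of the lastSeqWritten bookkeeping described above. */
class LastSeqWrittenSketch {
    // region name -> earliest log sequence id believed to be still in the memstore
    final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();
    long logSeqNum = 0;

    /** Models HLog.append(): putIfAbsent records only the first edit since the last flush. */
    long append(String region) {
        long seq = ++logSeqNum;
        lastSeqWritten.putIfAbsent(region, seq);
        return seq;
    }

    /** Models step 3 as described: the region's entry is removed. */
    void completeCacheFlushByRemoving(String region) {
        lastSeqWritten.remove(region);
    }

    /** Models the temporary measure: the entry is replaced with the flush's sequence id. */
    void completeCacheFlushByReplacing(String region, long flushSeqId) {
        lastSeqWritten.put(region, flushSeqId);
    }

    /** Replays steps 1 to 4; returns {seq id of first unflushed edit, tracked seq id}. */
    static long[] replay(boolean replaceInsteadOfRemove) {
        LastSeqWrittenSketch wal = new LastSeqWrittenSketch();
        String region = "r1";
        wal.append(region);                        // edit that ends up in the snapshot
        long flushSeqId = ++wal.logSeqNum;         // step 1: snapshot (assumed to get a seq id here)
        long firstUnflushed = wal.append(region);  // step 2: edit lands in the new memstore
        if (replaceInsteadOfRemove) {              // step 3
            wal.completeCacheFlushByReplacing(region, flushSeqId);
        } else {
            wal.completeCacheFlushByRemoving(region);
        }
        wal.append(region);                        // step 4: next append
        return new long[] {firstUnflushed, wal.lastSeqWritten.get(region)};
    }

    public static void main(String[] args) {
        long[] removed = replay(false);
        long[] replaced = replay(true);
        System.out.println("remove:  first unflushed=" + removed[0] + ", tracked=" + removed[1]);
        System.out.println("replace: first unflushed=" + replaced[0] + ", tracked=" + replaced[1]);
    }
}
```

In this model the remove variant ends with a tracked sequence id that is higher than the first unflushed edit, which is the miss described in step 4. The replace variant leaves a tracked id that is at or below every unflushed edit, so it is conservative rather than exact.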

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
