hbase-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13811) Splitting WALs, we are filtering out too many edits -> DATALOSS
Date Fri, 05 Jun 2015 01:12:38 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573900#comment-14573900 ]

Duo Zhang commented on HBASE-13811:

> Rather than add a new method that does what the old getEarliestMemstoreSeqNum did, I changed
> getEarliestMemstoreSeqNum to be how the old version worked.

Fine, I think it will work. But I am still a little nervous about having two methods with the
same name but different behaviors...

And I remember that, when implementing HBASE-10201 and HBASE-12405, at first I actually wanted to
return the flushedSeqId when calling startCacheFlush. But there were two problems. First, the
getNextSequenceId method is in HRegion, not in FSHLog, so a simple solution is to return NO_SEQ_NUM
when flushing all stores and let HRegion call getNextSequenceId. But here comes the second
problem: startCacheFlush may fail, which means we cannot start a flush, so there are three
types of return value: 'sequenceId', 'choose a sequenceId by yourself', and 'give up flushing!'.
I thought it was ugly to have a '-2' or a null java.lang.Long indicate 'give up flushing'
at that time, so I gave up...
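
To make the three outcomes concrete, here is a rough sketch of a small result type that could
stand in for a magic '-2' or a null Long. Everything in it is hypothetical (FlushStartSketch,
FlushStart, Kind and resolveFlushSeqId are illustration-only names, not HBase classes or
methods), and the LongSupplier only stands in for whatever the region side uses to get its
next sequence id.

```java
import java.util.OptionalLong;
import java.util.function.LongSupplier;

public class FlushStartSketch {

    // The three outcomes listed in the comment above.
    enum Kind { SEQUENCE_ID, CALLER_CHOOSES, CANNOT_FLUSH }

    // Hypothetical result type; an assumption for illustration, not an HBase class.
    static final class FlushStart {
        private static final FlushStart CHOOSE = new FlushStart(Kind.CALLER_CHOOSES, -1L);
        private static final FlushStart GIVE_UP = new FlushStart(Kind.CANNOT_FLUSH, -1L);

        final Kind kind;
        private final long sequenceId; // meaningful only when kind == SEQUENCE_ID

        private FlushStart(Kind kind, long sequenceId) {
            this.kind = kind;
            this.sequenceId = sequenceId;
        }

        static FlushStart withSequenceId(long id) {
            return new FlushStart(Kind.SEQUENCE_ID, id);
        }

        static FlushStart callerChooses() {
            return CHOOSE;
        }

        static FlushStart cannotFlush() {
            return GIVE_UP;
        }

        long sequenceId() {
            if (kind != Kind.SEQUENCE_ID) {
                throw new IllegalStateException("no sequence id for " + kind);
            }
            return sequenceId;
        }
    }

    // Hypothetical caller: an empty result means "give up flushing".
    static OptionalLong resolveFlushSeqId(FlushStart start, LongSupplier nextSequenceId) {
        switch (start.kind) {
            case SEQUENCE_ID:
                return OptionalLong.of(start.sequenceId());
            case CALLER_CHOOSES:
                // The caller picks the id itself, e.g. from its own sequence generator.
                return OptionalLong.of(nextSequenceId.getAsLong());
            default:
                return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        LongSupplier next = () -> 100L; // stand-in for the caller's sequence id source
        System.out.println(resolveFlushSeqId(FlushStart.withSequenceId(42L), next));
        System.out.println(resolveFlushSeqId(FlushStart.callerChooses(), next));
        System.out.println(resolveFlushSeqId(FlushStart.cannotFlush(), next));
    }
}
```

With something like this, the caller has to handle the 'give up' case explicitly instead of
comparing against a sentinel, at the cost of one more small class.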

Maybe we could consider this solution again? getEarliestMemstoreSeqNum can be used anywhere,
but startCacheFlush is restricted to the flushing scope, I think.


> Splitting WALs, we are filtering out too many edits -> DATALOSS
> ---------------------------------------------------------------
>                 Key: HBASE-13811
>                 URL: https://issues.apache.org/jira/browse/HBASE-13811
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 2.0.0, 1.2.0
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 2.0.0, 1.2.0
>         Attachments: 13811.branch-1.txt, 13811.branch-1.txt, 13811.txt, 13811.v2.branch-1.txt,
>                      13811.v3.branch-1.txt, 13811.v3.branch-1.txt, 13811.v4.branch-1.txt, 13811.v5.branch-1.txt,
>                      13811.v6.branch-1.txt, 13811.v6.branch-1.txt, HBASE-13811-v1.testcase.patch, HBASE-13811.testcase.patch
> I've been running ITBLLs against branch-1 around HBASE-13616 (move of ServerShutdownHandler
> to pv2). I have come across an instance of dataloss. My patch for HBASE-13616 was in place,
> so I can only think it is the cause (but cannot see how). When we split the logs, we are skipping
> legit edits. Digging.

This message was sent by Atlassian JIRA
