hbase-issues mailing list archives

From "Heng Chen (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-14949) Skip duplicate entries when replay WAL.
Date Thu, 10 Dec 2015 07:53:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15050250#comment-15050250 ]

Heng Chen edited comment on HBASE-14949 at 12/10/15 7:52 AM:
-------------------------------------------------------------

I checked the current logic and found that we do not need to do anything.

Duplicate entries are already skipped when the WAL is split into per-region recovered edits.
Each WAL is also named by the timestamp at which it was created, so there is no need for
another file name format.

Related code:

{code:title=WALSplitter#splitLogFile}
if (lastFlushedSequenceId >= entry.getKey().getLogSeqNum()) {
  editsSkipped++;
  continue;
}
{code}

I think we can mark this issue as invalid.


UPDATE:

Sorry, lastFlushedSequenceId is the sequence id that has been flushed to HFiles, so the check
above does not cover duplicates between WALs. Maybe we could skip duplicate entries in the WAL
in the same way.
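A minimal sketch of what skipping WAL duplicates "in the same way" might look like, under stated assumptions: entries belong to a single region, sequence ids only increase within each WAL, and WALs are replayed oldest first. The class ReplayDedupSketch, its Edit type, and the replay method are hypothetical names for illustration only; they are not HBase APIs and not the patch attached to this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; none of these names come from HBase. */
final class ReplayDedupSketch {

    /** Minimal stand-in for a WAL entry of a single region. */
    static final class Edit {
        final long seqId;
        final String payload;

        Edit(long seqId, String payload) {
            this.seqId = seqId;
            this.payload = payload;
        }
    }

    /**
     * Replays WALs oldest first. An edit is skipped when its sequence id is not
     * above the highest id already handled, which starts at the flushed id and
     * advances with every edit that is applied.
     */
    static List<Edit> replay(List<List<Edit>> walsOldestFirst, long lastFlushedSequenceId) {
        List<Edit> applied = new ArrayList<>();
        long highestHandled = lastFlushedSequenceId;
        for (List<Edit> wal : walsOldestFirst) {
            for (Edit edit : wal) {
                if (highestHandled >= edit.seqId) {
                    continue; // already flushed, or already replayed from an older WAL
                }
                applied.add(edit);
                highestHandled = edit.seqId;
            }
        }
        return applied;
    }

    public static void main(String[] args) {
        // The new WAL repeats the entry with sequence id 3 from the old WAL.
        List<Edit> oldWal = Arrays.asList(new Edit(1, "a"), new Edit(2, "b"), new Edit(3, "c"));
        List<Edit> newWal = Arrays.asList(new Edit(3, "c"), new Edit(4, "d"));
        for (Edit edit : replay(Arrays.asList(oldWal, newWal), 1L)) {
            System.out.println(edit.seqId + " " + edit.payload);
        }
    }
}
```

The comparison is the same as the one in the snippet above; the only difference is that the threshold moves forward as edits are applied instead of staying fixed at the flushed id.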



> Skip duplicate entries when replay WAL.
> ---------------------------------------
>
>                 Key: HBASE-14949
>                 URL: https://issues.apache.org/jira/browse/HBASE-14949
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Heng Chen
>         Attachments: HBASE-14949.patch
>
>
> Under the HBASE-14004 design, there will be duplicate entries in different WALs. This happens
> when an hflush fails: we close the old WAL at the 'acked hflushed' length, then open a new
> WAL and write the unacked hflushed entries into it.
> So there may be some overlap between the old WAL and the new WAL.
> We should skip the duplicate entries during replay. I think this does no harm to the current
> logic, so maybe we can do it first.
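The overlap described in the quoted text can be modelled roughly as follows. This is only an illustrative sketch, assuming the old WAL file may still physically hold entries past the acknowledged length; WalRollSketch and its methods are hypothetical names and do not reflect the HBASE-14004 implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of rolling a WAL after a failed hflush. */
final class WalRollSketch {
    /** Sequence ids appended to the old WAL, in write order. */
    private final List<Long> written = new ArrayList<>();
    /** Number of leading entries whose hflush was acknowledged. */
    private int ackedCount = 0;

    void append(long seqId) {
        written.add(seqId);
    }

    /** Marks everything written so far as acknowledged. */
    void ackAll() {
        ackedCount = written.size();
    }

    /** Entries that would be rewritten to the new WAL: everything past the acked length. */
    List<Long> entriesForNewWal() {
        return new ArrayList<>(written.subList(ackedCount, written.size()));
    }

    public static void main(String[] args) {
        WalRollSketch oldWal = new WalRollSketch();
        oldWal.append(1L);
        oldWal.append(2L);
        oldWal.ackAll();   // hflush acknowledged for 1 and 2
        oldWal.append(3L);
        oldWal.append(4L); // hflush for 3 and 4 fails, so the WAL is rolled
        System.out.println("rewritten to new WAL: " + oldWal.entriesForNewWal());
    }
}
```

If a reader of the old file sees any of the unacknowledged entries, those sequence ids appear in both files, which is the duplication that replay would have to skip.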



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
