hbase-issues mailing list archives

From "Duo Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14949) Resolve name conflict when splitting if there are duplicated WAL entries
Date Wed, 17 Feb 2016 06:30:18 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15149982#comment-15149982
] 

Duo Zhang commented on HBASE-14949:
-----------------------------------

{quote}
We used to have an isCreate flag. We don't have it anymore. Was it always true? (It looks like
it, going by your patch.)
{quote}
Yes, it was always true, and the method is only called from WALSplitter. I think we could make it private.

{quote}
Should you change formatRecoveredEditsFileName to take the original file name? It looks like
it is called from one other place at least.
{quote}
No, I just append the name of the WAL file being split to the result of formatRecoveredEditsFileName:
{code}
String fileName = formatRecoveredEditsFileName(logEntry.getKey().getSequenceId());
fileName = getTmpRecoveredEditsFileName(fileName + "-" + fileBeingSplit.getPath().getName());
return new Path(dir, fileName);
{code}
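
To illustrate why appending the WAL file name avoids the intermediate-name collision, here is a minimal, self-contained sketch. It uses plain strings instead of Hadoop `Path` objects. The 19-digit zero padding and the `.temp` suffix are assumptions of this sketch, not details confirmed in this thread, and `tmpEditsName` is a hypothetical helper name.

```java
public class TmpEditsNameSketch {
    // Assumption: the sequence id is zero-padded so that names sort in sequence order.
    static String formatRecoveredEditsFileName(long seqId) {
        return String.format("%019d", seqId);
    }

    // Assumption: a ".temp" suffix marks the intermediate file.
    static String getTmpRecoveredEditsFileName(String fileName) {
        return fileName + ".temp";
    }

    // Hypothetical helper mirroring the snippet above: sequence id, then "-", then the WAL file name.
    static String tmpEditsName(long firstSeqId, String walFileName) {
        String fileName = formatRecoveredEditsFileName(firstSeqId);
        return getTmpRecoveredEditsFileName(fileName + "-" + walFileName);
    }

    public static void main(String[] args) {
        // Two WAL files that both start at sequence id 42 now get distinct tmp names.
        System.out.println(tmpEditsName(42L, "wal.1"));
        System.out.println(tmpEditsName(42L, "wal.2"));
    }
}
```

With the sequence id alone, both calls in `main` would produce the same name. The WAL file name is what disambiguates them.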

{quote}
So, we write with the name of the WAL in the split file name. Where do we read it back? (I'm
asking you because you probably have your finger on it). I want to see if we handle case of
bare sequenceid as well as this new format. In fact, should we have a test that demonstrates
this?
{quote}
Sorry, I do not get the point... We only change the intermediate tmp file name; the file is
renamed when the split ends. As for a conflict on the final recovered edits file name, the old logic
just deletes the old file, whereas the new logic needs to delete the one with fewer entries...
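
The rule for the final file name can be sketched roughly as follows. This is an in-memory model only, not the patch itself: the map stands in for the recovered edits directory, `commit` and `entriesFor` are hypothetical names, and keeping the existing file on a tie is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

public class FinalEditsConflictSketch {
    // Hypothetical stand-in for the recovered edits directory: final file name -> number of entries.
    private final Map<String, Integer> finalFiles = new HashMap<>();

    /**
     * Tries to promote a finished tmp file to its final name.
     * Returns true if the new file now holds the final name, false if it was discarded.
     */
    boolean commit(String finalName, int entryCount) {
        Integer existing = finalFiles.get(finalName);
        if (existing != null && existing >= entryCount) {
            // The existing file has at least as many entries: drop the new one.
            // (Keeping the existing file on a tie is an assumption of this sketch.)
            return false;
        }
        // No conflict, or the new file has more entries: it replaces the old one.
        finalFiles.put(finalName, entryCount);
        return true;
    }

    // Entry count of the file currently holding the final name, or -1 if there is none.
    int entriesFor(String finalName) {
        Integer count = finalFiles.get(finalName);
        return count == null ? -1 : count;
    }

    public static void main(String[] args) {
        FinalEditsConflictSketch dir = new FinalEditsConflictSketch();
        dir.commit("edits-a", 3);
        dir.commit("edits-a", 2); // fewer entries, discarded
        System.out.println(dir.entriesFor("edits-a"));
    }
}
```

The old logic corresponds to always replacing the existing file, which can lose entries when the file that arrives later holds fewer of them.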

Thanks.

> Resolve name conflict when splitting if there are duplicated WAL entries
> ------------------------------------------------------------------------
>
>                 Key: HBASE-14949
>                 URL: https://issues.apache.org/jira/browse/HBASE-14949
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Heng Chen
>            Assignee: Duo Zhang
>         Attachments: HBASE-14949-v3.patch, HBASE-14949-v4.patch, HBASE-14949.patch, HBASE-14949_v1.patch,
HBASE-14949_v2.patch
>
>
> The AsyncFSHLog introduced in HBASE-14790 may write the same WAL entries to different WAL
files. A WAL entry is itself idempotent, so replay is not a problem, but the intermediate file
name and the final file name used when splitting are constructed from the lowest or highest sequence id
of the WAL entries written, so different WAL files can end up with the same intermediate
or final file name when splitting. In the current implementation, this causes the split to fail
or data to be lost. We need to solve this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
