zookeeper-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files
Date Tue, 13 Dec 2016 12:17:58 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15745011#comment-15745011 ]

ASF GitHub Bot commented on ZOOKEEPER-2574:
-------------------------------------------

Github user abhishekrai commented on a diff in the pull request:

    https://github.com/apache/zookeeper/pull/111#discussion_r92156972
  
    --- Diff: src/java/main/org/apache/zookeeper/server/PurgeTxnLog.java ---
    @@ -108,9 +141,11 @@ public boolean accept(File f){
             // remove the old files
             for(File f: files)
             {
    -            System.out.println("Removing file: "+
    +            final String msg = "Removing file: "+
                     DateFormat.getDateTimeInstance().format(f.lastModified())+
    -                "\t"+f.getPath());
    +                "\t"+f.getPath();
    +            LOG.info(msg);
    +            System.out.println(msg);
    --- End diff --
    
    It's not ideal, but as far as I can tell each call serves a purpose that the other cannot.
    
    System.out.println is useful when PurgeTxnLog is invoked directly from the CLI.  The old behavior provided only this.
    LOG.info is useful because the message is visible in the server log.  The old behavior did not log this information, which made debugging through server logs harder.


> PurgeTxnLog can inadvertently delete required txn log files
> -----------------------------------------------------------
>
>                 Key: ZOOKEEPER-2574
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2574
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.7, 3.4.8, 3.5.0, 3.5.1, 3.5.2
>         Environment: Zookeeper 3.4.8, standalone, and 3-server quorum
>            Reporter: Abhishek Rai
>            Assignee: Abhishek Rai
>             Fix For: 3.4.10, 3.5.3
>
>         Attachments: ZOOKEEPER-2574.2.patch, ZOOKEEPER-2574.3.patch, ZOOKEEPER-2574.4.patch, ZOOKEEPER-2574.5.patch, ZOOKEEPER-2574.6.patch, ZOOKEEPER-2574.patch
>
>
> As part of the fix for ZOOKEEPER-1797, the call to FileTxnSnapLog.getSnapshotLogs() was removed from PurgeTxnLog.java.  As a result, some old-looking but required txn log files can be deleted, resulting in data corruption or loss.
> For example, consider the following:
> 1. Configuration:
> autopurge.snapRetainCount=3
> 2. Following files exist:
> log.100 spans transactions from zxid=100 till zxid=140 (inclusive)
> snapshot.110 - snapshot as of zxid=110
> snapshot.120 - snapshot as of zxid=120
> snapshot.130 - snapshot as of zxid=130
> The above scenario is possible when snapshots have been taken multiple times without an accompanying log rollover, which can happen if the server was running as a learner.
> 3. PurgeTxnLog retains all snapshots but deletes log.100 because its starting zxid (100) is older than the zxid of the oldest snapshot (110).  This results in the loss of transactions in the range 131-140.
> Before the fix for ZOOKEEPER-1797, this was avoided by the call to FileTxnSnapLog.getSnapshotLogs(), which finds and retains the newest txn log file with starting zxid < oldest retained snapshot's highest zxid.
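
The retention rule described above can be sketched in a few lines of Java. This is only an illustration of the scenario in the issue description, not the code in PurgeTxnLog or FileTxnSnapLog: the class PurgeSketch and the helpers zxidOf, naiveRetain and safeRetain are hypothetical names, and the zxid suffix is parsed as a plain decimal number to match the example, which may differ from how real file names encode it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names exist in ZooKeeper.
class PurgeSketch {
    // Assumption: the suffix after the last '.' is the starting zxid, in decimal.
    static long zxidOf(String name) {
        return Long.parseLong(name.substring(name.lastIndexOf('.') + 1));
    }

    // Naive rule: keep only logs that start at or after the oldest retained snapshot.
    static List<String> naiveRetain(List<String> logs, long oldestSnapZxid) {
        List<String> kept = new ArrayList<>();
        for (String log : logs) {
            if (zxidOf(log) >= oldestSnapZxid) {
                kept.add(log);
            }
        }
        return kept;
    }

    // Safer rule: also keep the newest log that starts before the oldest
    // retained snapshot, since it may contain transactions past that snapshot.
    static List<String> safeRetain(List<String> logs, long oldestSnapZxid) {
        List<String> kept = new ArrayList<>();
        String straddling = null;
        for (String log : logs) {
            long zxid = zxidOf(log);
            if (zxid >= oldestSnapZxid) {
                kept.add(log);
            } else if (straddling == null || zxid > zxidOf(straddling)) {
                straddling = log;
            }
        }
        if (straddling != null) {
            kept.add(0, straddling);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Scenario from the description: log.100 with snapshots at 110, 120, 130.
        List<String> logs = Arrays.asList("log.100");
        long oldestSnapZxid = 110;
        System.out.println("naive keeps: " + naiveRetain(logs, oldestSnapZxid));
        System.out.println("safe keeps:  " + safeRetain(logs, oldestSnapZxid));
    }
}
```

With the files from the example, the naive rule retains no log at all, so transactions 131-140 are lost, while the safer rule retains log.100.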



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
