zookeeper-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-2994) Tool required to recover log and snapshot entries with CRC errors
Date Tue, 24 Apr 2018 14:34:01 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16449968#comment-16449968 ]

ASF GitHub Bot commented on ZOOKEEPER-2994:
-------------------------------------------

GitHub user anmolnar opened a pull request:

    https://github.com/apache/zookeeper/pull/508

    ZOOKEEPER-2994 Tool required to recover log and snapshot entries with CRC errors (3.4)

    This is the 3.4 version of https://github.com/apache/zookeeper/pull/487
    @phunt I've just realized that the patch must introduce a new dependency: commons-cli.
    Not sure if you're willing to merge it in this case.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/anmolnar/zookeeper ZOOKEEPER-2994_34

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/zookeeper/pull/508.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #508
    
----
commit 3bc2e5f7257ae23ccce1ff72a83510322efe716e
Author: Andor Molnar <andor@...>
Date:   2018-04-23T22:20:26Z

    ZOOKEEPER-2994: Tool required to recover log and snapshot entries with CRC errors
    
    https://issues.apache.org/jira/browse/ZOOKEEPER-2994
    
    In the event that the ZooKeeper transaction log becomes corrupted and fails CRC checks (preventing startup), we should have a mechanism to get the cluster running again.
    
    Previously we achieved this by loading the broken transaction log into a modified version of ZK with the CRC check disabled and forcing it to write new txn log files.
    
    Experience has shown that once you end up with a corrupt txn log, there is no way to recover except by manually modifying the CRC check. That's basically why the tool is needed.
    
    It's called TxnLogToolkit, a new console application similar to LogFormatter and SnapshotFormatter, but it is intentionally kept separate to preserve backward compatibility in the existing tools.
    
    This PR contains the txn log tool only.
    
    You will probably also notice a refactoring that extracts the file padding logic from FileTxnLog for reuse in the new tool. The related code changes can be reviewed on their own in a separate commit if preferred.
    
    Author: Andor Molnar <andor@cloudera.com>
    
    Reviewers: phunt@apache.org
    
    Closes #487 from anmolnar/ZOOKEEPER-2994 and squashes the following commits:
    
    221760ccc [Andor Molnar] ZOOKEEPER-2994. Added documentation and startup scripts
    a69d7297b [Andor Molnar] ZOOKEEPER-2994. Fix findbugs warning
    0b95efefd [Andor Molnar] ZOOKEEPER-2994. Fix for unit test
    15fa45c68 [Andor Molnar] ZOOKEEPER-2994. Added padding, tool renamed to TxnLogToolkit, interactive mode, etc.
    6a1ad0ec4 [Andor Molnar] ZOOKEEPER-2994. Refactor FileTxnLog's padding logic to separate class for reusability
    0d089ccdd [Andor Molnar] ZOOKEEPER-2994. Added new tool TxnLogTool for txn log file recovery
    
    Change-Id: I7560362633a7bc919ae6d3ca7e3588e196a1919c

----
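
The padding refactoring mentioned in the commit message above can be pictured with a small sketch. This is not the code from the PR: the class name, the method name, and the 4096-byte threshold are assumptions chosen for illustration. It shows one way to decide how far a preallocated log file should be grown before the next write.

```java
public class PaddingSketch {

    /**
     * Returns the size the file should have after padding, or the current
     * size unchanged when no growth is needed.
     *
     * All parameters are in bytes. A preAllocSize of 0 disables padding.
     * The 4096-byte threshold is an assumption made for this sketch.
     */
    public static long paddedSize(long position, long currentSize, long preAllocSize) {
        // Grow only if preallocation is enabled and the write position
        // is within 4096 bytes of the end of the file (or already past it).
        if (preAllocSize > 0 && position + 4096 >= currentSize) {
            if (position > currentSize) {
                // The writer is already past the end: jump ahead one chunk
                // and round down to a multiple of the chunk size.
                currentSize = position + preAllocSize;
                currentSize -= currentSize % preAllocSize;
            } else {
                // Normal case: extend the file by exactly one chunk.
                currentSize += preAllocSize;
            }
        }
        return currentSize;
    }

    public static void main(String[] args) {
        long chunk = 64 * 1024; // hypothetical chunk size for the demo
        System.out.println(paddedSize(0, 0, chunk));
        System.out.println(paddedSize(100, chunk, chunk));
        System.out.println(paddedSize(70000, chunk, chunk));
    }
}
```

Keeping this calculation in a pure function, separate from the file I/O, is what makes it easy to share between the server's log writer and a standalone tool.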


> Tool required to recover log and snapshot entries with CRC errors
> -----------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2994
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2994
>             Project: ZooKeeper
>          Issue Type: New Feature
>    Affects Versions: 3.5.4, 3.6.0
>            Reporter: Andor Molnar
>            Assignee: Andor Molnar
>            Priority: Major
>             Fix For: 3.5.4, 3.6.0
>
>
> In the event that the ZooKeeper transaction log or snapshot becomes corrupted and fails CRC checks (preventing startup), we should have a mechanism to get the cluster running again.
> Previously we achieved this by loading the broken transaction log into a modified version of ZK with the CRC check disabled and forcing it to snapshot.
> It'd be very handy to have a tool which can do this for us. LogFormatter and SnapshotFormatter have already been designed to dump log and snapshot files, so it'd be nice to extend their functionality and add the ability to perform such a recovery.
> Experience has shown that once you end up with a corrupt txn log, there is no way to recover except by manually modifying the CRC check. That's basically why the tool is needed.
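
To make the recovery idea in the description above concrete, here is a minimal sketch of checking a per-entry checksum and recalculating it when it no longer matches. It is an illustration under stated assumptions, not the tool's implementation: the Entry class and all method names are hypothetical, the record layout is heavily simplified compared to the real on-disk txn log format, and Adler-32 from the JDK is used as the checksum.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;
import java.util.zip.Checksum;

public class ChecksumRepairSketch {

    /** Hypothetical in-memory record: a stored checksum plus the payload it covers. */
    public static final class Entry {
        public final long storedChecksum;
        public final byte[] payload;

        public Entry(long storedChecksum, byte[] payload) {
            this.storedChecksum = storedChecksum;
            this.payload = payload;
        }
    }

    /** Computes an Adler-32 checksum over the whole payload. */
    public static long checksumOf(byte[] payload) {
        Checksum checksum = new Adler32();
        checksum.update(payload, 0, payload.length);
        return checksum.getValue();
    }

    /** True when the stored checksum matches the payload. */
    public static boolean isValid(Entry entry) {
        return entry.storedChecksum == checksumOf(entry.payload);
    }

    /** Returns a copy of the entry whose checksum matches its current payload. */
    public static Entry recalculate(Entry entry) {
        return new Entry(checksumOf(entry.payload), entry.payload);
    }

    public static void main(String[] args) {
        byte[] payload = "create /app/config".getBytes(StandardCharsets.UTF_8);
        Entry original = new Entry(checksumOf(payload), payload);

        // Simulate corruption: flip one bit in the payload, keep the old checksum.
        byte[] damaged = payload.clone();
        damaged[0] ^= 0x01;
        Entry corrupt = new Entry(original.storedChecksum, damaged);

        System.out.println("original valid:  " + isValid(original));
        System.out.println("corrupt valid:   " + isValid(corrupt));
        System.out.println("repaired valid:  " + isValid(recalculate(corrupt)));
    }
}
```

Note that recalculating the checksum does not restore the damaged bytes. It only makes the entry pass the check again so the log can be loaded, which is why the result should be reviewed before the cluster is trusted with it.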



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
