zookeeper-dev mailing list archives

From "Benjamin Reed (JIRA)" <j...@apache.org>
Subject [jira] Commented: (ZOOKEEPER-713) zookeeper fails to start - broken snapshot?
Date Thu, 18 Mar 2010 17:40:27 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12847013#action_12847013 ]

Benjamin Reed commented on ZOOKEEPER-713:

lukasz, thank you for reporting this problem and providing the details. we analyzed your logs
and found the following problems:

1) you are getting an OutOfMemoryError during the snapshot. that is why you are getting the
invalid snapshot file. it turns out that the invalid file isn't really a problem, since we
can just use an older snapshot to recover from, but it may indicate that you are running
very close to the memory limit and spending a lot of time in the GC. (this may aggravate the next
problem as well.)

2) the initLimit and tickTime in zoo.cfg may be too low. how much data is stored in zookeeper?
look at the snapshot file size: you need to be able to transmit the snapshot within initLimit*tickTime.
as an experiment, try scp'ing the snapshots between the different servers and see how long it
takes. tickTime should be increased on wan connections. you might try doubling both tickTime
and initLimit. (if you really are overloading the GC, it is going to slow things down.)
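to make the check in 2) concrete, here is a minimal sketch of the arithmetic: a follower has
initLimit*tickTime milliseconds to sync with the leader, so the estimated snapshot transfer time
must fit inside that window. the snapshot size and transfer rate below are hypothetical example
values, not taken from lukasz's logs.

```java
// Sketch of the initLimit*tickTime budget check described above.
// Snapshot size and transfer rate are hypothetical example values.
public class SyncBudget {
    // Milliseconds a follower is given to sync with the leader.
    static long syncWindowMs(int initLimit, int tickTimeMs) {
        return (long) initLimit * tickTimeMs;
    }

    // Estimated transfer time for a snapshot of snapshotBytes at bytesPerSec.
    static long transferMs(long snapshotBytes, long bytesPerSec) {
        return snapshotBytes * 1000 / bytesPerSec;
    }

    public static void main(String[] args) {
        long window = syncWindowMs(10, 2000); // 20000 ms with the values in this report
        long xfer = transferMs(500L * 1024 * 1024, 10L * 1024 * 1024); // 500 MB at 10 MB/s
        System.out.println("window=" + window + "ms transfer=" + xfer + "ms ok=" + (xfer < window));
    }
}
```

with those example numbers the transfer takes 50 seconds against a 20-second window, which would
reproduce exactly the read timeout lukasz describes below.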

3) we also noticed that it is taking a long time to read and write snapshots (~17 and ~40
seconds in some cases). do you have other things contending for the disk? this is going to
affect how long it takes for the leader to respond to a client, and thus how large the initLimit needs to be.
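one quick way to sanity-check 3) is to time a raw sequential read of a snapshot file outside of
zookeeper. this is a hypothetical illustration (the snapshot path is an example); point it at a
real file under your dataDir/version-2 and compare against the ~17/~40 second numbers from the logs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Rough check of raw sequential read time for a snapshot file.
// The default path below is a hypothetical example.
public class SnapshotReadTimer {
    static long timeReadMs(Path file) throws IOException {
        long start = System.nanoTime();
        byte[] data = Files.readAllBytes(file); // one sequential read, like server startup
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("read " + data.length + " bytes in " + elapsedMs + " ms");
        return elapsedMs;
    }

    public static void main(String[] args) throws IOException {
        timeReadMs(Path.of(args.length > 0 ? args[0] : "version-2/snapshot.0"));
    }
}
```

if this alone takes many seconds on an otherwise idle box, the disk (or the xen host) is the
bottleneck rather than zookeeper itself.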

4) we noticed that the jvm version you are using is pretty old. you may try upgrading to the
latest version, especially since you are using 64-bit linux.

5) finally, it appears you are running under xen. some people have had issues running in virtualized
environments, the main problems being memory and disk contention. if java ever starts swapping,
the GC can hang the process while pages go on and off the disk for reference checking. disk
contention will also slow down recovery and logging to disk during write operations.

we are opening a jira, ZOOKEEPER-714, related to 1) since we really should exit on OutOfMemoryErrors.
in your specific case we recovered, but in general it doesn't seem wise to continue execution
once you have gotten an OutOfMemoryError.
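a fix along the lines of ZOOKEEPER-714 could look something like the following; this is a
hypothetical sketch, not the actual patch. the idea is to install a default uncaught-exception
handler that halts the JVM whenever an Error such as OutOfMemoryError surfaces, instead of
continuing with possibly corrupt in-memory state. the decision logic is kept in a separate
method so it is testable on its own.

```java
// Hypothetical sketch of the ZOOKEEPER-714 idea: stop the process on
// OutOfMemoryError rather than continue with corrupt in-memory state.
public class ExitOnError {
    // Pure decision logic, kept separate from the halt call so it is testable:
    // any Error (OutOfMemoryError, StackOverflowError, ...) is treated as fatal.
    static boolean shouldHalt(Throwable t) {
        return t instanceof Error;
    }

    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, t) -> {
            if (shouldHalt(t)) {
                // halt() skips shutdown hooks; state may already be inconsistent.
                Runtime.getRuntime().halt(1);
            }
        });
    }
}
```

an alternative with no code change at all is the JVM flag -XX:OnOutOfMemoryError, which runs a
command (e.g. kill) when the error is thrown.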

> zookeeper fails to start - broken snapshot?
> -------------------------------------------
>                 Key: ZOOKEEPER-713
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-713
>             Project: Zookeeper
>          Issue Type: Bug
>    Affects Versions: 3.2.2
>         Environment: debian lenny; ia64; xen virtualization
>            Reporter: Lukasz Osipiuk
>         Attachments: node1-version-2.tgz-aa, node1-version-2.tgz-ab, node1-zookeeper.log.gz,
node2-version-2.tgz-aa, node2-version-2.tgz-ab, node2-version-2.tgz-ac, node2-zookeeper.log.gz,
node3-version-2.tgz-aa, node3-version-2.tgz-ab, node3-version-2.tgz-ac, node3-zookeeper.log.gz,
> Hi guys,
> The following is not a bug report but rather a question - but as I am attaching large
files I am posting it here rather than on the mailing list.
> Today we had a major failure in our production environment. Machines in the zookeeper cluster
went wild and all clients got disconnected.
> We tried to restart the whole zookeeper cluster but it got stuck in leader election.
> Calling the stat command on any machine in the cluster resulted in 'ZooKeeperServer not running'.
> In one of the logs I noticed an 'Invalid snapshot' message which disturbed me a bit.
> We did not manage to make the cluster work again with its data. We deleted the version-2 directories
on all nodes and then the cluster started up without problems.
> Is it possible that the snapshot/log data got corrupted in a way which made the cluster unable
to start?
> Fortunately we could rebuild the data we store in zookeeper, as we use it only for locks and
most of the nodes are ephemeral.
> I am attaching contents of version-2 directory from all nodes and server logs.
> The source problem occurred some time before 15:00. The first cluster restart happened at 15:03.
> At some point later we experimented with deleting the version-2 directory, so I would not
look at the following restarts because they can be misleading due to our actions.
> I am also attaching zoo.cfg. Maybe something is wrong there.
> As I now look into the logs I see a read timeout during the initialization phase after 20 secs
(initLimit=10, tickTime=2000).
> Maybe all I have to do is increase one or the other. Which one? Are there any downsides to
increasing tickTime?
> Best regards, Ɓukasz Osipiuk
> PS. due to the attachment size limit I used split. to untar, use
> cat nodeX-version-2.tgz-* | tar -xz

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
