zookeeper-user mailing list archives

From Asad Saeed <ASa...@scalecomputing.com>
Subject Re: Inconsistent data across 3.4.6 ensemble
Date Thu, 23 Apr 2015 01:43:05 GMT

Can you give more detail on how the data is inconsistent, and post the logs somewhere? Are
the log and data directories on different mountpoints?
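
One way to answer the mountpoint question is to read dataDir and dataLogDir from the server config and compare the device IDs of the two paths. The following is only a rough sketch: the helper names read_zk_dirs and same_filesystem are made up for illustration, and the parser assumes a plain key=value zoo.cfg.

```python
import os


def read_zk_dirs(cfg_text):
    """Extract dataDir and dataLogDir from zoo.cfg-style key=value text."""
    dirs = {}
    for line in cfg_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if key in ("dataDir", "dataLogDir"):
            dirs[key] = value.strip()
    # When dataLogDir is not set, transaction logs go to dataDir.
    dirs.setdefault("dataLogDir", dirs.get("dataDir"))
    return dirs


def same_filesystem(path_a, path_b):
    """Return True if both paths report the same device ID (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev
```

Calling same_filesystem(dirs["dataDir"], dirs["dataLogDir"]) on each node shows whether snapshots and transaction logs share a mount there.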

To recover immediately, stop ZooKeeper on the divergent nodes. Back up and then delete
the log and snap directories on those nodes, then restart ZooKeeper on them.


From: jlindwall <jlindwall@yahoo.com>
Sent: Apr 22, 2015 6:25 PM
To: zookeeper-user@hadoop.apache.org
Subject: Inconsistent data across 3.4.6 ensemble

We somehow are seeing inconsistent data across our 3-node prod ensemble.
Never saw anything like it in dev or qa. We are running on Solaris.

The dataDirs for the nodes were recently involved in a situation in which
the NFS disk they live on was unmounted and remounted while ZooKeeper was
running. Not sure if it is related.

Regardless, this seems like it should never happen with zookeeper.

Any ideas for correcting the situation?  I have 2 ideas, please critique:

1. Bring down follower 1, delete its dataLogDir and dataDir contents,
restart; do the same with follower 2
2. Bring down the whole thing; delete all dataLogDir and dataDir contents;

I'd prefer not to do option #2, but I will if I must.


View this message in context: http://zookeeper-user.578899.n2.nabble.com/Inconsistent-data-across-3-4-6-ensemble-tp7581007.html
Sent from the zookeeper-user mailing list archive at Nabble.com.