cassandra-user mailing list archives

From rubbish me <rubbish...@googlemail.com>
Subject commit log to disk with periodic mode
Date Thu, 23 Aug 2012 21:59:29 GMT
Hi all

First off, please let me introduce the setup.

----
- a balanced ring of 6 x C* 1.1.2 nodes in the active DC (DC1), and 6 in another (DC2);
- the keyspace's RF=3 in each DC;
- the client talks only to DC1 unless DC1 can't serve the request, in which case it talks only to
DC2;
- the commit log is synced periodically, with the default period of 10s;
- consistency level = LOCAL_QUORUM for both reads and writes;
- we are running on production Linux VMs (not ideal, but this is out of our hands).
----

As part of a DR exercise, we brutally killed all 6 nodes in DC1, and the client started talking to
DC2. All data survived and everything continued to work perfectly.

Then we brought all nodes in DC1 back up, one by one. Each logged a message saying its commit logs
had all been replayed. No errors were reported. We didn't run repair at this time.

However, DC1 lost data that had been written an hour before the DR exercise. It seemed that everything
after the last memtable flush was gone.

If we understand correctly, writes go to the commit log first, and the commit log is then synced to
disk every 10s. At worst we should have lost the last 10s of data.
But it seemed as if the periodic sync didn't happen. What could be the cause of this behaviour?
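
To make that expectation concrete, here is a minimal sketch of how we understand periodic mode. It is only a toy model, not Cassandra's actual code: the function names are made up for illustration, and it assumes the sync fires at exact multiples of the period and that a sync makes every earlier commit log write durable.

```python
def last_sync_before(crash_time, sync_period=10.0):
    """Time of the most recent sync tick at or before the crash.

    Assumption: syncs fire at 0, sync_period, 2 * sync_period, ...
    """
    return (crash_time // sync_period) * sync_period


def split_writes(write_times, crash_time, sync_period=10.0):
    """Split write timestamps (in seconds) into (durable, lost) lists.

    A write counts as durable only if a sync happened at or after it
    and before the crash. Everything after the last sync is lost.
    """
    cutoff = last_sync_before(crash_time, sync_period)
    durable = [t for t in write_times if t <= cutoff]
    lost = [t for t in write_times if t > cutoff]
    return durable, lost


# Hypothetical timeline: five writes, then the node is killed at t=27s.
durable, lost = split_writes([1.0, 9.0, 12.0, 19.5, 25.0], crash_time=27.0)
```

Under this model the loss window can never reach a full sync period, which is why an hour of missing data surprised us.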

Thanks to C*, we could recover all this data from DC2. But we would like to understand
the possible cause.

Many thanks in advance.

- A