Yes. 

You probably shouldn't ever be using CL.ANY (though I'm certain there are others that disagree with me; I wish them the best of luck with that).

CL.ONE + periodic sync can potentially lose recently written data, but if you care about that then you'd better care enough about your data to use something greater than CL.ONE.  With CL.ONE + periodic: If your disk dies you lose data.  If your OS crashes on a node you lose data.  If the processor melts you lose data.  If your memory goes bad you lose data.  If your UPS is interrupted you lose data.  If X (for many values of X) you lose data.  Note that for some situations (e.g. disk failure) it doesn't matter what the commit log sync (batch vs. periodic) is set to; you lose data.

If your C* process dies and/or is killed you should not lose data.  The write goes to the commit log before the client is acked; however, with commitlogsync=periodic that entry may not have made it to disk yet.  So, if you kill the C* process you're fine.  If you nicely restart the OS, you should be fine (assuming your boxen/raid controllers/disks/etc do the sane thing).  If you nuke your OS, then see above about losing data on CL.ONE.
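
A toy model may make the difference between the two sync modes concrete. This is only a sketch, not Cassandra's commit log: the class `CommitLogSketch`, its methods, and the two-tier "unsynced vs. synced" split are all made up for illustration, and it simply assumes (per the paragraph above) that entries appended before a process kill still reach disk, while entries not yet synced are gone after an OS crash.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy model of a commit log; NOT Cassandra's implementation. */
class CommitLogSketch {
    public enum SyncMode { PERIODIC, BATCH }

    private final SyncMode mode;
    // Appended to the log but not yet synced to disk (assumed tier).
    private final List<String> unsynced = new ArrayList<>();
    // Synced to disk.
    private final List<String> synced = new ArrayList<>();

    public CommitLogSketch(SyncMode mode) {
        this.mode = mode;
    }

    /** Appends a mutation and returns once the client would be acked. */
    public void write(String mutation) {
        unsynced.add(mutation);
        if (mode == SyncMode.BATCH) {
            sync(); // batch: sync before acking
        }
        // periodic: ack immediately; a timer is assumed to call sync() later
    }

    /** Stands in for the periodic timer firing (or the batch sync). */
    public void sync() {
        synced.addAll(unsynced);
        unsynced.clear();
    }

    /** Assumption from the text: a killed process loses nothing already appended. */
    public List<String> survivorsAfterProcessKill() {
        List<String> all = new ArrayList<>(synced);
        all.addAll(unsynced);
        return all;
    }

    /** Assumption: an OS crash discards everything not yet synced. */
    public List<String> survivorsAfterOsCrash() {
        return new ArrayList<>(synced);
    }

    public static void main(String[] args) {
        CommitLogSketch periodic = new CommitLogSketch(SyncMode.PERIODIC);
        periodic.write("a");
        periodic.sync();      // timer fires
        periodic.write("b");  // acked, but not yet synced
        System.out.println("periodic, process kill: " + periodic.survivorsAfterProcessKill());
        System.out.println("periodic, OS crash:     " + periodic.survivorsAfterOsCrash());

        CommitLogSketch batch = new CommitLogSketch(SyncMode.BATCH);
        batch.write("a");
        batch.write("b");
        System.out.println("batch, OS crash:        " + batch.survivorsAfterOsCrash());
    }
}
```

Under those assumptions the periodic log keeps both writes after a process kill but only the first after an OS crash, while the batch log keeps both either way. Neither mode helps if the disk itself dies, which is the point about CL.ONE above.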


On Thu, Oct 7, 2010 at 7:11 PM, David McIntosh <david@radiotime.com> wrote:

Are there any data loss concerns if you have the commit log sync set to periodic and are writing with CL One or Any?

 

From: Matthew Dennis [mailto:mdennis@riptano.com]
Sent: Wednesday, October 06, 2010 8:53 PM
To: user@cassandra.apache.org
Subject: Re: Newbie Question about restarting Cassandra

 

Rob is correct.

drain is really there for when you need the commit log to be empty (some upgrades or a complete backup of a shutdown cluster).

There really is no point in using it to shut down C* normally; just kill it...

On Wed, Oct 6, 2010 at 4:18 PM, Rob Coli <rcoli@digg.com> wrote:

On 10/6/10 1:13 PM, Aaron Morton wrote:

To shut down cleanly, say in a production system, use nodetool drain
first. This will flush the memtables and put the node into a read-only
mode. AFAIK this also gives the other nodes a faster way of detecting
that the node is down, via the drained node gossiping its new status. Then kill.

 

FWIW, the gossiper related code for "drain" (trunk) looks like it just stops the gossip service, which is almost certainly the same thing that happens if you kill Cassandra.

./src/java/org/apache/cassandra/service/StorageService.java
"
    public synchronized void drain() throws IOException, InterruptedException, ExecutionException
...
        setMode("Starting drain process", true);
        Gossiper.instance.stop();
"

./src/java/org/apache/cassandra/gms/Gossiper.java
"
    public void stop()
    {
        scheduledGossipTask.cancel(false);
    }
"
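
For anyone unfamiliar with the call, here is a standalone sketch of what cancel(false) does to a periodic task in plain java.util.concurrent. The class `CancelSketch`, the counter, and the timings are made up for illustration and have nothing to do with Cassandra's actual Gossiper; the sketch only shows that the task stops being scheduled and that nothing extra runs on the way out.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demo of ScheduledFuture.cancel(false); not Cassandra code. */
class CancelSketch {
    /** Returns {rounds seen shortly after cancel, rounds seen a while later}. */
    public static int[] runThenCancel() {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger rounds = new AtomicInteger();
        try {
            // Stand-in for a periodic "gossip round": just count invocations.
            ScheduledFuture<?> task = executor.scheduleWithFixedDelay(
                    rounds::incrementAndGet, 0, 10, TimeUnit.MILLISECONDS);

            Thread.sleep(100);
            // false = do not interrupt a run already in progress;
            // just prevent any further runs from being scheduled.
            task.cancel(false);

            Thread.sleep(50); // let any in-flight run finish
            int shortlyAfterCancel = rounds.get();

            Thread.sleep(100); // the task should stay silent from here on
            int later = rounds.get();
            return new int[] { shortlyAfterCancel, later };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] counts = runThenCancel();
        System.out.println("rounds shortly after cancel: " + counts[0]);
        System.out.println("rounds a while later:        " + counts[1]);
    }
}
```

The two counts should match: once cancelled, the task simply never runs again, and no final message or callback is sent as part of the cancel itself.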

=Rob




--
Riptano
Software and Support for Apache Cassandra
http://www.riptano.com/
mdennis@riptano.com
m: 512.587.0900 f: 866.583.2068