For background have a read of http://wiki.apache.org/cassandra/HintedHandoff

As the doc (the one above and Martin :) ) says, CL ONE, QUORUM and ALL only count writes to nodes that are responsible for the key. Then HH is used to eventually deliver that write to any nodes that were not available. 

CL.ANY is a lot less consistent and will ack when only a HH is recorded. 
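
To make the difference between the consistency levels concrete, here is a rough Python sketch. It is a simplified model, not Cassandra's actual code: the function name `can_ack_write` and its parameters are made up for illustration, and it assumes that for CL.ANY it is enough that some live node can record a hint.

```python
def can_ack_write(cl, rf, live_replicas, hint_can_be_stored=True):
    """Simplified model: can a write at consistency level `cl` be acknowledged?

    live_replicas counts only live nodes that are responsible for the key.
    """
    if not 0 <= live_replicas <= rf:
        raise ValueError("live_replicas must be between 0 and rf")
    if cl == "ANY":
        # A recorded hint is enough, even with no live replica for the key.
        return live_replicas >= 1 or hint_can_be_stored
    required = {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}[cl]
    return live_replicas >= required


# RF=2 with one of the key's two replicas down, as in this thread.
for cl in ("ANY", "ONE", "QUORUM", "ALL"):
    print(cl, can_ack_write(cl, rf=2, live_replicas=1))
```

In this model ANY and ONE can still be acknowledged with one replica down at RF=2, while QUORUM and ALL cannot.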

AFAIK you are right that upping the RF to 5 will mean you can lose two nodes *responsible for the key* and still run a QUORUM write. 

Aaron

On 14 Sep, 2010, at 11:36 PM, Chris Jansen <chris.jansen@cognitomobile.com> wrote:

Thank you Martin, this has cleared things up for me. I thought that a replica would always be stored on the node I was connecting to, which makes sense as to why the load on each node is equally balanced.

 

So I could sustain quorum with two node failures if I have a RF=5 or greater.

 

Thanks again.

 

Chris

 

From: Dr. Martin Grabmüller [mailto:Martin.Grabmueller@eleven.de]
Sent: 14 September 2010 09:54
To: user@cassandra.apache.org
Subject: RE: UnavailableException with 3 nodes and RF=2

 

When you write with QUORUM, RF/2+1 of the nodes Cassandra *wants to write to* have to be up. In your case, RF/2+1 = 2, which means the two nodes responsible for the write have to be up, not just any two nodes. Every write that has to go to the node with token 78502309573904554351249603414557542595 and another node will fail.

 

QUORUM consistency only gives you more availability when you have a RF of 3 or higher.
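
The RF/2+1 arithmetic above can be spelled out in a few lines of Python. This is only a worked calculation; the helper names `quorum` and `tolerated_replica_failures` are invented for this example and are not part of any Cassandra API.

```python
def quorum(rf):
    """Replicas that must acknowledge a QUORUM write: RF/2 + 1 (integer division)."""
    return rf // 2 + 1


def tolerated_replica_failures(rf):
    """Replicas of a key that can be down while QUORUM is still reachable."""
    return rf - quorum(rf)


for rf in (1, 2, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, can lose {tolerated_replica_failures(rf)}")
```

With RF=2 the quorum is 2, so no replica of a key may be down. RF=3 tolerates one and RF=5 tolerates two.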

 

Martin


From: Chris Jansen [mailto:chris.jansen@cognitomobile.com]
Sent: Tuesday, September 14, 2010 10:44 AM
To: user@cassandra.apache.org
Subject: UnavailableException with 3 nodes and RF=2

Hi All,

 

I'm a newbie to Cassandra, so I could have a configuration issue here. I am using the latest stable release, 0.6.0.

 

I have created a cluster of 3 nodes and a keyspace with RF=2 and a rack-unaware replication strategy. When I write with CL=QUORUM and all 3 nodes are up, the data is committed fine, but when I write with the same CL while one of the nodes is down, an UnavailableException is thrown. Surely if one of the nodes in the cluster is down another should acknowledge the writes and maintain the quorum, or is there something that I have misunderstood? From what I understand, with RF=2 a quorum write needs two nodes to acknowledge it (RF/2+1), and I still have two nodes up.

 

Here is how the cluster looks when quorum writes succeed:

 

192.168.245.2 Up         477.33 KB     78502309573904554351249603414557542595     |<--|

192.168.245.4 Up         426.74 KB     139625953069891725539207365034742863768    |   |

192.168.245.1 Up         496.67 KB     163572901304139170217093255272499595459    |-->|

 

This is how it looks with one node down and quorum writes fail (I am writing to 192.168.245.1):

 

192.168.245.2 Down       423.58 KB     78502309573904554351249603414557542595     |<--|

192.168.245.4 Up         426.74 KB     139625953069891725539207365034742863768    |   |

192.168.245.1 Up         496.67 KB     163572901304139170217093255272499595459    |-->|
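
The failing case can be modelled with a small Python sketch of this ring. It is a simplification under stated assumptions, not Cassandra code: it assumes the rack-unaware strategy puts the first replica on the node whose token is the first one at or after the key's token (wrapping around) and the remaining replicas on the following nodes in ring order. The names `RING`, `replicas_for` and `quorum_write_ok` are made up for illustration.

```python
from bisect import bisect_left

# Ring from the listing above: (token, address, is_up).
RING = sorted([
    (78502309573904554351249603414557542595, "192.168.245.2", False),
    (139625953069891725539207365034742863768, "192.168.245.4", True),
    (163572901304139170217093255272499595459, "192.168.245.1", True),
])


def replicas_for(token, rf, ring=RING):
    """First node whose token is >= `token` (wrapping), plus the next rf-1 nodes."""
    tokens = [t for t, _, _ in ring]
    start = bisect_left(tokens, token) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(rf)]


def quorum_write_ok(token, rf, ring=RING):
    """True if at least RF/2 + 1 of the key's replicas are up."""
    live = sum(1 for _, _, up in replicas_for(token, rf, ring) if up)
    return live >= rf // 2 + 1


# One sample key token inside each node's primary range.
for token, address, _ in RING:
    names = [a for _, a, _ in replicas_for(token, 2)]
    print(address, names, quorum_write_ok(token, 2))
```

Under these assumptions only keys whose replicas are 192.168.245.4 and 192.168.245.1 can be written at QUORUM. Any key with a replica on 192.168.245.2 cannot reach the two acknowledgements it needs.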

 

Here is the exception that is thrown:

 

Cannot write: 9e48b039-7687-4b14-9b40-0096b15fd7b0 RETRYING

UnavailableException()

                at org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:12303)

                at org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:675)

                at org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:648)

                at cassandraclient.Main.writeReadDelete(Main.java:101)

                at cassandraclient.Main.run(Main.java:188)

                at java.lang.Thread.run(Thread.java:619)

 

If I switch to CL=ONE the writes succeed, but I don't know if the data is being replicated.

 

Any help would be greatly appreciated, thanks.

 

Chris Jansen



