incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@yakaz.com>
Subject Re: UnavailableException with 3 nodes and RF=2
Date Tue, 14 Sep 2010 08:54:50 GMT
On Tue, Sep 14, 2010 at 10:43 AM, Chris Jansen
<chris.jansen@cognitomobile.com> wrote:
> Hi All,
>
>
>
> I’m a newbie to Cassandra so I could have a configuration issue here. I am
> using the latest stable release 0.6.0.
>
>
>
> I have created a cluster of 3 nodes, a keyspace with RF=2 and a rack unaware
> replication strategy. When I write with CL=QUORUM and all 3 nodes up, the
> data commits fine, but when I write with the same CL with one of the nodes
> down I see an UnavailableException thrown. Surely if one of the nodes in the
> cluster is down another should acknowledge the writes and maintain the
> quorum, or is there something that I have misunderstood? From what I
> understand, in this case with RF=2, for the quorum writes to succeed I need
> two nodes to acknowledge the write (RF/2+1), which I have.

RF=2 means that each row is replicated on 2 of your nodes. As you said,
quorum is then 2. This means that for a quorum operation to succeed, both
of the 2 nodes that hold the row (*not* any 2 out of all the nodes) must
be alive. Put another way, if *any* one of your nodes is dead, some
operations will fail with an UnavailableException. That is, quorum
tolerates a node being down only starting at RF=3.
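
To make the arithmetic concrete, here is a small sketch in Python. It is
a toy model, not Cassandra code: it assumes the rack-unaware strategy
places each range on its primary node plus the next RF-1 nodes around the
ring, and the function names (quorum, replicas, quorum_available) are
made up for the illustration. The node addresses are the ones from your
ring output.

```python
def quorum(rf):
    # Quorum size for a given replication factor: RF/2 + 1 (integer division).
    return rf // 2 + 1


def replicas(ring, primary_index, rf):
    # Assumed rack-unaware placement: the primary node for a range,
    # followed by the next rf - 1 nodes walking around the ring.
    return [ring[(primary_index + i) % len(ring)] for i in range(rf)]


def quorum_available(ring, down, rf):
    # For each primary range, report whether enough of its replicas
    # are alive to satisfy a quorum operation.
    result = {}
    for i, node in enumerate(ring):
        live = [n for n in replicas(ring, i, rf) if n not in down]
        result[node] = len(live) >= quorum(rf)
    return result


# Nodes in token order, as in the ring output quoted in this thread.
ring = ["192.168.245.2", "192.168.245.4", "192.168.245.1"]
down = {"192.168.245.2"}

rf2 = quorum_available(ring, down, rf=2)
rf3 = quorum_available(ring, down, rf=3)

print("RF=2:", rf2)
print("RF=3:", rf3)
```

Under these assumptions, with RF=2 only the range whose two replicas are
192.168.245.4 and 192.168.245.1 can still reach quorum. With RF=3 every
range keeps 2 live replicas out of 3, so quorum operations still succeed.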

>
>
>
> Here is how the cluster looks when quorum writes succeed:
>
>
>
> 192.168.245.2 Up         477.33 KB
> 78502309573904554351249603414557542595     |<--|
>
> 192.168.245.4 Up         426.74 KB
> 139625953069891725539207365034742863768    |   |
>
> 192.168.245.1 Up         496.67 KB
> 163572901304139170217093255272499595459    |-->|
>
>
>
> This is how it looks with one node down and quorum writes fail (I am writing
> to 192.168.245.1):
>
>
>
> 192.168.245.2 Down       423.58 KB
> 78502309573904554351249603414557542595     |<--|
>
> 192.168.245.4 Up         426.74 KB
> 139625953069891725539207365034742863768    |   |
>
> 192.168.245.1 Up         496.67 KB
> 163572901304139170217093255272499595459    |-->|
>
>
>
> Here is the exception that is thrown:
>
>
>
> Cannot write: 9e48b039-7687-4b14-9b40-0096b15fd7b0 RETRYING
>
> UnavailableException()
>
>                 at
> org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:12303)
>
>                 at
> org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:675)
>
>                 at
> org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:648)
>
>                 at cassandraclient.Main.writeReadDelete(Main.java:101)
>
>                 at cassandraclient.Main.run(Main.java:188)
>
>                 at java.lang.Thread.run(Thread.java:619)
>
>
>
> If I switch CL=ONE the writes succeed, but I don’t know if the data is being
> replicated.

Whatever consistency level you use for a write, the data is always
replicated unless some error occurs. The difference is how many replica
acknowledgements the write waits for before returning, that is, whether
it waits to see if an error occurs or not.
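
As a rough illustration of that distinction, here is a toy model in
Python. It is not how Cassandra is implemented: the names
(UnavailableError, required_acks, write) are hypothetical, and it only
models which replicas the write is sent to versus how many
acknowledgements the caller blocks for.

```python
class UnavailableError(Exception):
    """Stands in for Cassandra's UnavailableException in this toy model."""


def required_acks(cl, rf):
    # Number of replica acknowledgements each consistency level waits for.
    return {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}[cl]


def write(replica_up, cl):
    # replica_up maps each replica of the row to True (alive) or False (down).
    rf = len(replica_up)
    needed = required_acks(cl, rf)
    live = [name for name, up in replica_up.items() if up]
    if len(live) < needed:
        # Not enough live replicas to ever satisfy the consistency level.
        raise UnavailableError(f"need {needed} replicas, only {len(live)} alive")
    # The write goes to every live replica regardless of the consistency
    # level; the level only sets how many acknowledgements are awaited.
    return {"sent_to": live, "waited_for": needed}


both_up = {"192.168.245.1": True, "192.168.245.2": True}
one_down = {"192.168.245.1": True, "192.168.245.2": False}

print(write(both_up, "ONE"))
print(write(one_down, "ONE"))
try:
    write(one_down, "QUORUM")
except UnavailableError as exc:
    print("QUORUM failed:", exc)
```

In this model a CL=ONE write with both replicas up is still sent to both
of them; it just returns after the first acknowledgement.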

--
Sylvain

>
>
>
> Any help would be greatly appreciated, thanks.
>
>
>
> Chris Jansen
