From: "Chris Jansen"
Reply-To: user@cassandra.apache.org
Date: Tue, 14 Sep 2010 09:43:39 +0100
Subject: UnavailableException with 3 nodes and RF=2

Hi All,

I'm a newbie to Cassandra, so I could have a configuration issue here. I am using the latest stable release, 0.6.0.

I have created a cluster of 3 nodes and a keyspace with RF=2 and a rack-unaware replication strategy. When I write with CL=QUORUM with all 3 nodes up, the data is committed fine, but when I write with the same CL with one of the nodes down, I see an UnavailableException thrown. Surely if one of the nodes in the cluster is down, another should acknowledge the writes and maintain the quorum, or is there something that I have misunderstood? From what I understand, with RF=2, quorum writes need two nodes to acknowledge the write (RF/2+1), which I have.
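The RF/2+1 arithmetic mentioned above can be checked with a short sketch. It assumes integer division and that the quorum is counted against the RF replicas of a given key rather than against every node in the cluster; the function names are made up for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must acknowledge a QUORUM operation: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_replica_failures(replication_factor: int) -> int:
    """Replicas of one key that may be down while QUORUM is still reachable."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    for rf in (1, 2, 3):
        print(f"RF={rf}: quorum={quorum(rf)}, "
              f"tolerated replica failures={tolerated_replica_failures(rf)}")
```

Under these assumptions RF=2 gives a quorum of 2 with no tolerated replica failures, while RF=3 also gives a quorum of 2 but tolerates one replica being down.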

Here is how the cluster looks when quorum writes succeed:

192.168.245.2  Up    477.33 KB  78502309573904554351249603414557542595   |<--|
192.168.245.4  Up    426.74 KB  139625953069891725539207365034742863768  |   |
192.168.245.1  Up    496.67 KB  163572901304139170217093255272499595459  |-->|

This is how it looks with one node down, when quorum writes fail (I am writing to 192.168.245.1):

192.168.245.2  Down  423.58 KB  78502309573904554351249603414557542595   |<--|
192.168.245.4  Up    426.74 KB  139625953069891725539207365034742863768  |   |
192.168.245.1  Up    496.67 KB  163572901304139170217093255272499595459  |-->|

Here is the exception that is thrown:

Cannot write: 9e48b039-7687-4b14-9b40-0096b15fd7b0 RETRYING
UnavailableException()
        at org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:12303)
        at org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:675)
        at org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:648)
        at cassandraclient.Main.writeReadDelete(Main.java:101)
        at cassandraclient.Main.run(Main.java:188)
        at java.lang.Thread.run(Thread.java:619)

If I switch to CL=ONE the writes succeed, but I don't know if the data is being replicated.

Any help would be greatly appreciated, thanks.

Chris Jansen
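
The ring listing above can be turned into a small simulation of which key ranges can still reach a quorum with 192.168.245.2 down. This is only a sketch: it assumes the rack-unaware strategy places a key on the first node whose token is greater than or equal to the key's token, plus the next RF-1 nodes clockwise, and the helper names (`replicas_for`, `can_write`) are invented for illustration rather than taken from Cassandra.

```python
from bisect import bisect_left

# Ring from the message: (token, address, is_up), with 192.168.245.2 down.
RING = sorted([
    (78502309573904554351249603414557542595, "192.168.245.2", False),
    (139625953069891725539207365034742863768, "192.168.245.4", True),
    (163572901304139170217093255272499595459, "192.168.245.1", True),
])
RF = 2
QUORUM = RF // 2 + 1  # 2 when RF=2


def replicas_for(key_token, ring=RING, rf=RF):
    """Assumed placement: owner of the token range, then the next rf-1 nodes clockwise."""
    tokens = [token for token, _, _ in ring]
    # First node whose token is >= key_token; wrap to index 0 past the last token.
    start = bisect_left(tokens, key_token) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(rf)]


def can_write(key_token, required_acks):
    """True if enough replicas of this key are up to collect the required acks."""
    live = sum(1 for _, _, is_up in replicas_for(key_token) if is_up)
    return live >= required_acks


if __name__ == "__main__":
    for token, address, _ in RING:
        # A key whose token equals a node's token falls in that node's range.
        replica_addresses = [addr for _, addr, _ in replicas_for(token)]
        print(f"range owned by {address}: replicas={replica_addresses}, "
              f"QUORUM ok={can_write(token, QUORUM)}, ONE ok={can_write(token, 1)}")
```

Under these assumptions, only the range owned by 192.168.245.4 (replicated to 192.168.245.1) still has both replicas up, so QUORUM is unreachable for the other two ranges while ONE remains reachable for all three. As far as I understand, the consistency level only sets how many acknowledgements the coordinator waits for; the write is still sent to every live replica of the key.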
