Sorry, the bug was in our snitch. We're using getHostName() instead of getCanonicalHostName() to determine DC & Rack, and since for the local node it returns an alias instead of the reverse DNS name, the DC & Rack numbers are not as expected.
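For anyone hitting the same thing, here is a minimal sketch of the difference. It is not our actual snitch code: the class name, the parseDcRack helper and the "dc<N>-rack<N>-..." naming scheme are made up for illustration. For the address returned by InetAddress.getLocalHost(), getHostName() typically gives back the locally configured name (possibly a short alias), while getCanonicalHostName() asks the resolver for the fully qualified name.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SnitchHostNameSketch {
    // Hypothetical naming scheme, e.g. "dc1-rack2-node3.example.com".
    // The real scheme used by our snitch is not shown here.
    private static final Pattern DC_RACK = Pattern.compile("^dc(\\d+)-rack(\\d+)-.*");

    /** Returns {dc, rack} parsed from the name, or null if the name does not follow the scheme. */
    static int[] parseDcRack(String hostName) {
        Matcher m = DC_RACK.matcher(hostName);
        if (!m.matches()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    private static String describe(String hostName) {
        int[] dcRack = parseDcRack(hostName);
        return dcRack == null
                ? hostName + " -> does not match the naming scheme"
                : hostName + " -> DC=" + dcRack[0] + ", Rack=" + dcRack[1];
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress local = InetAddress.getLocalHost();
        // May be a short alias for the local node.
        System.out.println("getHostName():          " + describe(local.getHostName()));
        // Asks the resolver for the fully qualified name.
        System.out.println("getCanonicalHostName(): " + describe(local.getCanonicalHostName()));
    }
}
```

If the two lines printed on a node differ, a snitch that parses the first one can place the local node in a different DC/Rack than its peers do.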

 

 

 

Best regards/ Pagarbiai

 

Viktor Jevdokimov

Senior Developer

 

Email:  Viktor.Jevdokimov@adform.com

Phone: +370 5 212 3063. Fax: +370 5 261 0453

J. Jasinskio 16C, LT-01112 Vilnius, Lithuania

 

 


Disclaimer: The information contained in this message and attachments is intended solely for the attention and use of the named addressee and may be confidential. If you are not the intended recipient, you are reminded that the information remains the property of the sender. You must not use, disclose, distribute, copy, print or rely on this e-mail. If you have received this message in error, please contact the sender immediately and irrevocably delete this message and any copies.


From: Viktor Jevdokimov [mailto:Viktor.Jevdokimov@adform.com]
Sent: Thursday, December 01, 2011 14:05
To: user@cassandra.apache.org
Subject: NetworkTopologyStrategy bug?

 

Assume for now that we have 1 DC and 1 rack with 3 nodes. The ring will look like this

(we use our own snitch, which returns DC=0, Rack=0 for this case):

 

Address     DC  Rack  Token
                      113427455640312821154458202477256070484
10.0.0.1    0   0     0
10.0.0.2    0   0     56713727820156410577229101238628035242
10.0.0.3    0   0     113427455640312821154458202477256070484

 

Schema: ReplicaPlacementStrategy=NetworkTopologyStrategy, options: [0:2] (2 replicas in DC 0).

 

When trying to run cleanup (same problem with repair), Cassandra reports:

 

From 10.0.0.1:

DEBUG [time] 10.0.0.2,10.0.0.3 endpoints in datacenter 0 for token 0

DEBUG [time] 10.0.0.2,10.0.0.3 endpoints in datacenter 0 for token 56713727820156410577229101238628035242

DEBUG [time] 10.0.0.3,10.0.0.2 endpoints in datacenter 0 for token 113427455640312821154458202477256070484

INFO [time] Cleanup cannot run before a node has joined the ring

 

From 10.0.0.2:

DEBUG [time] 10.0.0.1,10.0.0.3 endpoints in datacenter 0 for token 0

DEBUG [time] 10.0.0.1,10.0.0.3 endpoints in datacenter 0 for token 56713727820156410577229101238628035242

DEBUG [time] 10.0.0.3,10.0.0.1 endpoints in datacenter 0 for token 113427455640312821154458202477256070484

INFO [time] Cleanup cannot run before a node has joined the ring

 

From 10.0.0.3:

DEBUG [time] 10.0.0.1,10.0.0.2 endpoints in datacenter 0 for token 0

DEBUG [time] 10.0.0.1,10.0.0.2 endpoints in datacenter 0 for token 56713727820156410577229101238628035242

DEBUG [time] 10.0.0.2,10.0.0.1 endpoints in datacenter 0 for token 113427455640312821154458202477256070484

INFO [time] Cleanup cannot run before a node has joined the ring

 

To me this means that each node thinks the whole data range is on the other two nodes.
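To illustrate how such a result can come about, here is a simplified sketch of picking replicas for one DC by walking the ring. It is not Cassandra's actual NetworkTopologyStrategy code (rack awareness is left out), and the class, method and "unknown" DC name are made up. The assumption is that the snitch on 10.0.0.1 places 10.0.0.1 itself outside DC 0: with 2 replicas, token 0 then maps to 10.0.0.2 and 10.0.0.3, while a correct snitch gives 10.0.0.1 and 10.0.0.2.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Function;

public class RingWalkSketch {
    /**
     * Simplified replica selection for one datacenter: start at the first node
     * whose token is >= the given token, walk the ring clockwise (wrapping around)
     * and collect nodes that the snitch places in that datacenter until rf nodes
     * are found. Rack awareness is deliberately left out.
     */
    static List<String> endpointsInDc(TreeMap<BigInteger, String> ring, BigInteger token,
                                      String dc, int rf, Function<String, String> dcOf) {
        List<String> clockwise = new ArrayList<>(ring.tailMap(token, true).values());
        clockwise.addAll(ring.headMap(token, false).values());
        List<String> result = new ArrayList<>();
        for (String node : clockwise) {
            if (result.size() == rf) {
                break;
            }
            if (dc.equals(dcOf.apply(node))) {
                result.add(node);
            }
        }
        return result;
    }

    /** The three-node ring from this message. */
    static TreeMap<BigInteger, String> ring() {
        TreeMap<BigInteger, String> ring = new TreeMap<>();
        ring.put(BigInteger.ZERO, "10.0.0.1");
        ring.put(new BigInteger("56713727820156410577229101238628035242"), "10.0.0.2");
        ring.put(new BigInteger("113427455640312821154458202477256070484"), "10.0.0.3");
        return ring;
    }

    public static void main(String[] args) {
        TreeMap<BigInteger, String> ring = ring();
        // Assumed view on 10.0.0.1: the local node is NOT reported as being in DC "0".
        Function<String, String> localView = node -> node.equals("10.0.0.1") ? "unknown" : "0";
        // Expected view: every node is in DC "0".
        Function<String, String> expectedView = node -> "0";
        for (BigInteger token : ring.keySet()) {
            System.out.println("token " + token
                    + " local view: " + endpointsInDc(ring, token, "0", 2, localView)
                    + " expected: " + endpointsInDc(ring, token, "0", 2, expectedView));
        }
    }
}
```

The order of the endpoints in the DEBUG lines should not be read as ring order; only the membership is compared here.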

 

As a result:

 

A WRITE request with any key/any token sent to 10.0.0.1 as coordinator will be forwarded to and saved on 10.0.0.2 and 10.0.0.3.

A READ request with CL.ONE with any key/any token sent to 10.0.0.2 as coordinator will be forwarded to 10.0.0.1 or 10.0.0.3, and since 10.0.0.1 can't have the data from the write above, some requests fail and some don't (if 10.0.0.3 answers).

Moreover, every READ request to any node will be forwarded to another node.

 

That is what we have right now with 0.8.6 and up to 1.0.5, both with 3 nodes in 1 DC and with 8x2 nodes.

 

 

 
