cassandra-commits mailing list archives

From "Roger Schildmeijer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10423) Paxos/LWT failures when moving node
Date Fri, 16 Oct 2015 10:17:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960477#comment-14960477 ]

Roger Schildmeijer commented on CASSANDRA-10423:
------------------------------------------------

We had to do yet another (nodetool) move, and the same thing happened.
We moved a node from token 6362172968960304802 to token 4611686018427387907.

It used to sit between (node) 6148914691236517208 and (node) 7686143364045646509.
It now sits between (node) 3074457345618258605 and (node) 6148914691236517208.

Some (sorted) tokens of the LWT queries that failed:
1550752142907493170
1681261686482955214
1787784122186449673
2206896992809998407
2679778263008234502
3440226803292810454
3551446884592709276

My non-scientific conclusion is that all LWT queries whose tokens ended up in a certain range failed.
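
To make the range observation easier to check, here is a minimal Python sketch that classifies the failed tokens against the boundaries mentioned above. It assumes Cassandra's usual (left, right] convention for token ranges and uses only the tokens quoted in this comment; the helper name in_range and the variable names are made up for illustration. The full ring and the replication settings are not given here, so this says nothing about which replica sets were involved.

```python
def in_range(token, left, right):
    """Return True if token lies in the ring range (left, right].

    A range whose left bound is not smaller than its right bound is
    treated as wrapping around the end of the ring.
    """
    if left < right:
        return left < token <= right
    return token > left or token <= right


# Tokens quoted in this comment.
OLD_TOKEN = 6362172968960304802        # position before the move
NEW_TOKEN = 4611686018427387907        # position after the move
NEW_LOWER_NEIGHBOUR = 3074457345618258605
OLD_LOWER_NEIGHBOUR = 6148914691236517208

FAILED = [
    1550752142907493170,
    1681261686482955214,
    1787784122186449673,
    2206896992809998407,
    2679778263008234502,
    3440226803292810454,
    3551446884592709276,
]

# Failed tokens inside the range between the new lower neighbour and the new token.
in_new_range = [t for t in FAILED if in_range(t, NEW_LOWER_NEIGHBOUR, NEW_TOKEN)]

# Failed tokens inside the range between the old lower neighbour and the old token.
in_old_range = [t for t in FAILED if in_range(t, OLD_LOWER_NEIGHBOUR, OLD_TOKEN)]

# Failed tokens at or below the new lower neighbour.
below_neighbour = [t for t in FAILED if t <= NEW_LOWER_NEIGHBOUR]

print("in (new lower neighbour, new token]:", in_new_range)
print("in (old lower neighbour, old token]:", in_old_range)
print("at or below new lower neighbour:", below_neighbour)
print("all failed tokens below new token:", all(t < NEW_TOKEN for t in FAILED))
```

With the numbers above, two of the failed tokens fall between the new lower neighbour and the new token, none fall in the range next to the old token, and the remaining five lie below the new lower neighbour. Every failed token in the sample is below the new token.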


> Paxos/LWT failures when moving node
> -----------------------------------
>
>                 Key: CASSANDRA-10423
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10423
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Cassandra version: 2.0.14
> Java-driver version: 2.0.11
>            Reporter: Roger Schildmeijer
>            Assignee: Ryan McGuire
>
> While moving a node (nodetool move <newtoken>) we noticed that LWT started failing for some (~50%) requests.
> The java-driver (version 2.0.11) returned com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency SERIAL (7 replica were required but only 0 acknowledged the write).
> The cluster was not under heavy load.
> I noticed that the failed LWT requests all took just above 1s. That information and the WriteTimeoutException could indicate that this happens:
> https://github.com/apache/cassandra/blob/cassandra-2.0.14/src/java/org/apache/cassandra/service/StorageProxy.java#L268
> I can't explain why, though. Why would there be more CAS contention just because a node is moving?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
