cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sam Overton (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-3881) reduce computational complexity of processing topology changes
Date Tue, 15 May 2012 15:45:44 GMT


Sam Overton commented on CASSANDRA-3881:

First the important bit: With these patches, StorageService.calculatePendingRanges is almost
three orders of magnitude faster when calculating two nodes bootstrapping into a cluster with
2048 nodes (22ms vs 14.6sec). See the [graph here|].
This was tested with 1 DC and 1 rack with RF=2.

The problem lies in NetworkTopologyStrategy.calculateNaturalEndpoints. The main problems with
the existing implementation are:

1. for each datacentre:
2. it iterates through all the tokens in the ring at least once
3. then does an NlogN sort of those tokens
4. then if number of racks < RF it will iterate through all tokens again because it can't
tell if it has exhausted the racks in that DC
5. then if number of hosts in that DC < RF it will iterate through all tokens again, otherwise
it will iterate through until it has RF hosts in that DC

so it's doing O(DC * (N + NlogN + N + N)) operations just to work out the endpoints for a
single token. StorageService.calculatePendingRanges then puts this inside other loops (such
as AbstractReplicationStrategy.getAddressRanges) which makes it at least O(N^2*logN).

These patches fix (1) by iterating through the tokens only once, and processing all DCs simultaneously.

(2,3&5) relate to knowing which endpoints exist in a given DC, (4) relates to knowing
which racks appear in a DC, so the first patch adds this knowledge to the snitch. The second
patch makes use of this knowledge to simplify calculateNaturalEndpoints.

||branch|| || ||description
some functionality to AbstractEndpointSnitch to track which endpoints and racks exist in a
DC (allows for fixing of 2-5).
O(logN) implementation of calculateNaturalEndpoints using the topology information from the

Note: These branches are managed with [TopGit|].  If you are
applying the patch output manually, you will either need to filter the TopGit metadata files
( {{wget -O - <url> | filterdiff -x*.topdeps -x*.topmsg | patch -p1}} ), or remove them
afterward ( {{rm .topmsg .topdeps}} ).
> reduce computational complexity of processing topology changes
> --------------------------------------------------------------
>                 Key: CASSANDRA-3881
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Sam Overton
> This constitutes follow-up work from CASSANDRA-3831 where a partial improvement was committed,
but the fundamental issue was not fixed. The maximum "practical" cluster size was significantly
improved, but further work is expected to be necessary as cluster sizes grow.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:!default.jspa
For more information on JIRA, see:


View raw message