cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Branimir Lambov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9258) Range movement causes CPU & performance impact
Date Mon, 14 Dec 2015 16:23:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056216#comment-15056216
] 

Branimir Lambov commented on CASSANDRA-9258:
--------------------------------------------

The solution looks good overall.

{{PendingRangeMaps}} could be declared iterable so that the for-in syntax can be used in e.g.
{{TokenMetadata.getPendingRangesMM}}.

The comparators should not be created by the constructor as they can just as well be static
final constants. It is not very clear why you need 4 of them, a comment explaining what each
ordering does would help a lot (e.g. sorts ends ascending, and its secondary comparison for
starts is chosen in such a way that and <end, end> ranges come before non-wraparound
intervals with the same end).

{{PendingRangeMaps.addPendingRange}} needs a comment on why adding the address is done only
once in the found case. Or restate it as finding the list to add to, creating if necessary,
then adding the new address, for example:
{code}
List<InetAddress> addresses = ascending.get(range);
if (addresses == null)
{
    addresses = new ArrayList<InetAddress>(1);
    ascending.put(range, addresses);
    descending.put(range, addresses);
}
addresses.add(address);
{code}

In {{PendingRangeMaps.pendingEndpointsFor}} the {{keySet, containsKey, get}} sequence is a
bit more expensive than it could be: either use {{entrySet}} to avoid {{smaller.get}}, or
{{bigger.get}} instead of {{containsKey}} to directly use the value if it's not null. Similarly,
the two wrap-around loops should be on the {{entrySet}} s to avoid unnecessarily searching
for the entry again.


> Range movement causes CPU & performance impact
> ----------------------------------------------
>
>                 Key: CASSANDRA-9258
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9258
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Cassandra 2.1.4
>            Reporter: Rick Branson
>            Assignee: Dikang Gu
>             Fix For: 2.1.x
>
>         Attachments: 0001-pending-ranges-map.patch
>
>
> Observing big CPU & latency regressions when doing range movements on clusters with
many tens of thousands of vnodes. See CPU usage increase by ~80% when a single node is being
replaced.
> Top methods are:
> 1) Ljava/math/BigInteger;.compareTo in Lorg/apache/cassandra/dht/ComparableObjectToken;.compareTo

> 2) Lcom/google/common/collect/AbstractMapBasedMultimap;.wrapCollection in Lcom/google/common/collect/AbstractMapBasedMultimap$AsMap$AsMapIterator;.next
> 3) Lorg/apache/cassandra/db/DecoratedKey;.compareTo in Lorg/apache/cassandra/dht/Range;.contains
> Here's a sample stack from a thread dump:
> {code}
> "Thrift:50673" daemon prio=10 tid=0x00007f2f20164800 nid=0x3a04af runnable [0x00007f2d878d0000]
>    java.lang.Thread.State: RUNNABLE
>       at org.apache.cassandra.dht.Range.isWrapAround(Range.java:260)
>       at org.apache.cassandra.dht.Range.contains(Range.java:51)
>       at org.apache.cassandra.dht.Range.contains(Range.java:110)
>       at org.apache.cassandra.locator.TokenMetadata.pendingEndpointsFor(TokenMetadata.java:916)
>       at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:775)
>       at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:541)
>       at org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:616)
>       at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:1101)
>       at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:1083)
>       at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:976)
>       at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3996)
>       at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3980)
>       at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>       at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>       at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:205)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message