cassandra-commits mailing list archives

From "Cyril Scetbon (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-11933) Improve Repair performance
Date Tue, 31 May 2016 23:17:12 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cyril Scetbon updated CASSANDRA-11933:
--------------------------------------
    Description: 
During a full repair on a ~60-node cluster, I've seen that this stage can be
significant (up to 60 percent of the whole repair time):

https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/StorageService.java#L2983-L2997

It's simply caused by the fact that https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/ActiveRepairService.java#L189
calls {code}ss.getLocalRanges(keyspaceName){code} every time, and that call takes more than 99%
of the time of this stage. This call takes 600ms when there is no load on the cluster and more if there
is. So for 10k ranges, you can imagine that it takes at least 1.5 hours just to compute ranges.


Underneath, it calls [ReplicationStrategy.getAddressRanges|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L170]
which can get pretty inefficient ([~jbellis]'s [words|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L165]).

*ss.getLocalRanges(keyspaceName)* should be cached to avoid having to spend hours on it.

  was:
During a full repair on a ~60-node cluster, I've seen that this stage can be
significant (up to 60 percent of the whole repair time):

https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/StorageService.java#L2983-L2997

It's simply caused by the fact that https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/ActiveRepairService.java#L189
calls {code}ss.getLocalRanges(keyspaceName){code} every time, and that call takes more than 99%
of the time of this stage. This call takes 600ms when there is no load on the cluster and more if there
is. So for 10k ranges, you can imagine that it takes at least 1.5 hours just to compute ranges.


Underneath it calls [ReplicationStrategy.getAddressRanges|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L170]
which can get pretty inefficient.

*ss.getLocalRanges(keyspaceName)* should be cached to avoid having to spend hours on it.


> Improve Repair performance
> --------------------------
>
>                 Key: CASSANDRA-11933
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11933
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Cyril Scetbon
>
> During a full repair on a ~60-node cluster, I've seen that this stage
> can be significant (up to 60 percent of the whole repair time):
> https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/StorageService.java#L2983-L2997
> It's simply caused by the fact that https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/ActiveRepairService.java#L189
> calls {code}ss.getLocalRanges(keyspaceName){code} every time, and that call takes more than 99%
> of the time of this stage. This call takes 600ms when there is no load on the cluster and more if there
> is. So for 10k ranges, you can imagine that it takes at least 1.5 hours just to compute ranges.
>
> Underneath, it calls [ReplicationStrategy.getAddressRanges|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L170]
> which can get pretty inefficient ([~jbellis]'s [words|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L165]).
> *ss.getLocalRanges(keyspaceName)* should be cached to avoid having to spend hours on
> it.
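
The caching proposed in the description could look roughly like the sketch below. It is not Cassandra code and not the fix that was applied to this ticket: LocalRangeCache, its loader function, and the topologyVersion supplier are hypothetical names introduced for illustration. It assumes that the local ranges of a keyspace only change when the ring topology or the replication settings change, and that some counter is incremented on each such change.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical helper, not part of Cassandra: caches an expensive
// per-keyspace computation until a topology version counter changes.
final class LocalRangeCache<R> {
    // Cached value plus the topology version it was computed for.
    private static final class Entry<R> {
        final long version;
        final R ranges;

        Entry(long version, R ranges) {
            this.version = version;
            this.ranges = ranges;
        }
    }

    private final ConcurrentHashMap<String, Entry<R>> cache = new ConcurrentHashMap<>();
    // Stands in for the expensive ss.getLocalRanges(keyspaceName) call.
    private final Function<String, R> loader;
    // Assumed counter that is bumped on every ring or replication change.
    private final LongSupplier topologyVersion;

    LocalRangeCache(Function<String, R> loader, LongSupplier topologyVersion) {
        this.loader = loader;
        this.topologyVersion = topologyVersion;
    }

    // Returns the cached ranges, recomputing only if the entry is missing or stale.
    R get(String keyspaceName) {
        long current = topologyVersion.getAsLong();
        return cache.compute(keyspaceName, (ks, old) ->
                old != null && old.version == current
                        ? old
                        : new Entry<R>(current, loader.apply(ks))).ranges;
    }
}
```

With a cache along these lines, a repair over 10k ranges would pay the ~600ms cost once per topology change instead of 10,000 times (10,000 x 0.6s = 6,000s, roughly 100 minutes). Keying the entry on a version counter rather than a timeout avoids serving stale ranges after a node joins or leaves, provided the counter really is bumped on every such change.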



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
