cassandra-commits mailing list archives

From "Paulo Motta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources
Date Thu, 16 Jun 2016 23:20:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334937#comment-15334937 ]

Paulo Motta commented on CASSANDRA-12015:
-----------------------------------------

bq.  However, beware of different RF in different DCs. You may have RF=3 in source DC and
RF=5 in target DC; what will be the paired replica of the 4th replica of target DC? Maybe
use some modulo function. Same kind of issue if target DC RF > source DC RF.

Hmm, good point. It seems this might be a bit harder than initially thought.

I suggest we restrict this ticket to no longer using dynamic snitch proximity when picking
replicas to stream from, which would already prevent hotspots and help in the reported case,
and tackle the more general problem of load-balancing replica selection in another ticket.
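
For reference, the modulo pairing mentioned in the quoted comment could look roughly like the
sketch below. It is only an illustration, not Cassandra code: the function name
paired_source and the node names are made up, and it assumes the replicas of a range are
available as ordered lists in both DCs.

```python
from typing import Sequence


def paired_source(target_index: int, source_replicas: Sequence[str]) -> str:
    """Return the source replica paired with the target replica at target_index.

    Hypothetical helper: wraps around with a modulo when the target DC has
    more replicas than the source DC.
    """
    if not source_replicas:
        raise ValueError("no source replicas available")
    return source_replicas[target_index % len(source_replicas)]


# RF=3 in the source DC, RF=5 in the target DC (the case raised in the quote).
source = ["dc1-a", "dc1-b", "dc1-c"]
target = ["dc2-a", "dc2-b", "dc2-c", "dc2-d", "dc2-e"]

# Map every target replica to the source replica it would stream from.
pairs = {t: paired_source(i, source) for i, t in enumerate(target)}
```

With these inputs the 4th target replica wraps around to the first source replica, so two
source nodes serve two targets each and the third serves only one. The pairing is defined
for any RF combination but is not perfectly even, which is part of why the general problem
is better handled separately.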

> Rebuilding from another DC should use different sources
> -------------------------------------------------------
>
>                 Key: CASSANDRA-12015
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12015
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Fabien Rousseau
>
> Currently, when adding a new DC (e.g. DC2) and rebuilding it from an existing DC (e.g. DC1), only the closest replica is used as a "source of data".
> It works but is not optimal, because with RF=3 and a 3-node cluster, only one node in DC1 is streaming the data to DC2.
> To build the new DC in a reasonable time, it would be better, in that case, to stream from multiple sources, thus distributing the load more evenly.
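
The reported case can be illustrated with a small sketch. This is not how Cassandra selects
sources internally; the proximity scores, node names, and both selection functions are
assumptions made for the example. It only shows why always picking the closest replica
concentrates streaming on one node when RF=3 on a 3-node DC.

```python
from collections import Counter

nodes = ["dc1-a", "dc1-b", "dc1-c"]

# With RF=3 on a 3-node DC, every range is replicated on all three nodes.
ranges = {f"range-{i}": list(nodes) for i in range(6)}

# Hypothetical proximity scores as seen from the rebuilding node (lower = closer).
proximity = {"dc1-a": 1, "dc1-b": 2, "dc1-c": 3}


def closest_source(replicas):
    """Always pick the closest replica for a range."""
    return min(replicas, key=proximity.__getitem__)


def rotated_source(range_index, replicas):
    """Ignore proximity and rotate through the replicas range by range."""
    return replicas[range_index % len(replicas)]


# Count how many ranges each source node would have to stream.
closest_load = Counter(closest_source(r) for r in ranges.values())
rotated_load = Counter(rotated_source(i, r) for i, r in enumerate(ranges.values()))
```

Here closest_load puts all six ranges on dc1-a, while rotated_load assigns two ranges to
each of the three nodes.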



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
