cassandra-user mailing list archives

Subject Re: Re: Repair question - why is so much data transferred?
Date Thu, 21 Jul 2011 15:43:32 GMT
from ticket 2818:
"One (reasonably simple) proposition to fix this would be to have repair  
schedule validation compactions across nodes one by one (ie, one CF/range  
at a time), waiting for all nodes to return their tree before submitting  
the next request. Then on each node, we should make sure that the node will  
start the validation compaction as soon as requested. For that, we probably  
want to have a specific executor for validation compaction"

...This is the way I thought repair already worked.
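
To make the quoted proposal concrete, here is a rough sketch of the scheduling it describes: one CF/range request at a time, waiting for every node's tree before submitting the next, with a dedicated executor for validation. This is only an illustration, not Cassandra's code. `build_tree`, `repair_sequentially`, the node names, and the token ranges are all hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def build_tree(node, cf, token_range):
    """Hypothetical stand-in for a validation compaction on one node.

    A real node would read its data for the range and return a Merkle tree;
    here we just return a tuple describing the request.
    """
    return (node, cf, token_range)


def repair_sequentially(nodes, jobs, executor):
    """Submit one (cf, range) job at a time and wait for every node's tree."""
    results = []
    for cf, token_range in jobs:
        # Ask all nodes to validate the same CF/range at (roughly) the same time.
        futures = [executor.submit(build_tree, n, cf, token_range) for n in nodes]
        # Block until all nodes have returned their tree before the next request.
        wait(futures)
        results.append([f.result() for f in futures])
    return results


# Dedicated executor, so validation work is not queued behind other tasks.
validation_executor = ThreadPoolExecutor(max_workers=3)
trees = repair_sequentially(
    ["n1", "n2", "n3"],
    [("cf1", (0, 100)), ("cf1", (100, 200))],
    validation_executor,
)
validation_executor.shutdown()
```

The point of waiting inside the loop is that all trees for a given range are built close together in time, so they reflect roughly the same state of the data.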

Anyway, in our case, we only have one CF, so I'm not sure if both issues  
apply to my situation.

Thanks. Looking forward to the release where these two issues are fixed.

On , Jonathan Ellis <> wrote:
> On Thu, Jul 21, 2011 at 9:14 AM, Jonathan Colby wrote:

> > I regularly run repair on my Cassandra cluster. However, I often see
> > that very large amounts of data are transferred to other nodes during
> > the repair operation.
> >
> > My question is: if only some data is out of sync, why are entire data
> > files being transferred?

> Repair streams ranges of files as a unit (which becomes a new file on
> the target node) rather than using the normal write path.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support

