cassandra-user mailing list archives

From Jonathan Ellis <>
Subject Re: AW: Strange nodetool repair behaviour
Date Tue, 05 Apr 2011 13:49:20 GMT
Sounds like

On Mon, Apr 4, 2011 at 7:46 AM, aaron morton <> wrote:
> Jonas, AFAIK if repair completed successfully there should be no streaming the next time
> round. This sounds odd; please look into it if you can.
> Can you run at DEBUG logging? There will be some messages about receiving streams from
> files and which ranges are being requested.
> I would be interested to know if the repair is completing successfully. You should see
> messages such as "Repair session blah completed successfully" if it is. It is possible for
> repair to hang if one of the neighbours goes away or fails to send the data. In this case
> the repair session will time out after 48 hours.
> Aaron
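
One way to follow the suggestion above is to scan a node's log for repair-related messages. The Python sketch below is illustrative only: the patterns are inferred from the phrases quoted in this thread ("Repair session ... completed successfully", "Performing streaming repair of N ranges"), the exact wording may differ between Cassandra versions, and `summarize_repair_log` is a hypothetical helper, not part of Cassandra.

```python
import re

# Patterns inferred from the messages quoted in this thread (assumptions,
# not an authoritative list of Cassandra log formats).
COMPLETED = re.compile(r"Repair session (\S+) completed successfully")
STREAMING = re.compile(r"Performing streaming repair of (\d+) ranges")


def summarize_repair_log(lines):
    """Return (completed session ids, total ranges streamed) for the given log lines."""
    completed = []
    ranges_streamed = 0
    for line in lines:
        match = COMPLETED.search(line)
        if match:
            completed.append(match.group(1))
            continue
        match = STREAMING.search(line)
        if match:
            ranges_streamed += int(match.group(1))
    return completed, ranges_streamed


# Hypothetical sample lines, for illustration only.
sample = [
    "INFO Performing streaming repair of 27 ranges",
    "INFO Repair session blah completed successfully",
]
print(summarize_repair_log(sample))
```

Because the function accepts any iterable of strings, an open log file can be passed in directly. If a streaming repair is logged but no matching completion message ever appears, that would be consistent with the hang scenario described above.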
> On 4 Apr 2011, at 20:39, Roland Gude wrote:
>> I am experiencing the same behavior but had it on previous versions of 0.7 as well.
>> -----Original Message-----
>> From: Jonas Borgström []
>> Sent: Monday, 4 April 2011 12:26
>> To:
>> Subject: Strange nodetool repair behaviour
>> Hi,
>> I have a 6 node 0.7.4 cluster with replication_factor=3 where "nodetool
>> repair keyspace" behaves really strangely.
>> The keyspace contains three column families and about 60GB of data in total
>> (i.e. 30GB on each node).
>> Even though no data has been added or deleted since the last repair, a
>> repair takes hours and the repairing node seems to receive 100+GB worth
>> of sstable data from its neighbouring nodes, i.e. several times the
>> actual data size.
>> The log says things like:
>> "Performing streaming repair of 27 ranges"
>> And a bunch of:
>> "Compacted to <filename> 22,208,983,964 to 4,816,514,033 (~21% of original)"
>> In the end the repair finishes without any error after a few hours but
>> even then the active sstables seem to contain lots of redundant data
>> since the disk usage can be cut in half by triggering a major compaction.
>> All this leads me to believe that something stops the AES from correctly
>> figuring out what data is already on the repairing node and what needs
>> to be streamed from the neighbours.
>> The only thing I can think of right now is that one of the column
>> families contains a lot of large rows that are larger than
>> memtable_throughput and that's perhaps what's confusing the Merkle tree.
>> Anyway, is this a known problem or perhaps expected behaviour?
>> Otherwise I'll try to create a more reproducible test case.
>> Regards,
>> Jonas
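
To illustrate the mechanism discussed above: anti-entropy repair compares hash trees built over token ranges and streams the ranges whose hashes differ. The Python sketch below is a simplified toy model, not Cassandra's implementation. The fixed number of leaves (assumed to be a power of two), the SHA-256 hashing, the toy partitioner and the names `leaf_hashes`, `build_tree` and `differing_leaves` are all assumptions made for illustration. It shows how a single differing row causes its whole leaf range to be flagged, which is one way a repair could transfer more data than actually differs.

```python
import hashlib


def leaf_hashes(rows, num_leaves):
    """Hash rows into num_leaves buckets; each bucket stands in for a token range."""
    buckets = [hashlib.sha256() for _ in range(num_leaves)]
    for key in sorted(rows):
        # Toy partitioner: pick the bucket from the first byte of the key's hash.
        index = hashlib.sha256(key.encode()).digest()[0] * num_leaves // 256
        buckets[index].update(key.encode())
        buckets[index].update(rows[key].encode())
    return [b.digest() for b in buckets]


def build_tree(leaves):
    """Return a list of levels, from the leaves up to the single root hash."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([
            hashlib.sha256(prev[i] + prev[i + 1]).digest()
            for i in range(0, len(prev), 2)
        ])
    return levels


def differing_leaves(tree_a, tree_b):
    """Walk down from the root, descending only into subtrees whose hashes differ."""
    top = len(tree_a) - 1
    suspects = [0] if tree_a[top][0] != tree_b[top][0] else []
    for level in range(top - 1, -1, -1):
        suspects = [
            child
            for parent in suspects
            for child in (2 * parent, 2 * parent + 1)
            if tree_a[level][child] != tree_b[level][child]
        ]
    return suspects


# Two hypothetical replicas that differ in exactly one row.
replica_a = {f"row{i}": "v1" for i in range(100)}
replica_b = dict(replica_a, row42="v2")

tree_a = build_tree(leaf_hashes(replica_a, 8))
tree_b = build_tree(leaf_hashes(replica_b, 8))
print(differing_leaves(tree_a, tree_b))  # indices of leaf ranges that would be streamed
```

In this toy model every row in a flagged leaf range would be re-sent even though only one row changed, so coarser leaves mean more redundant streaming per mismatch.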

Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
