incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: AW: Strange nodetool repair behaviour
Date Mon, 04 Apr 2011 12:46:32 GMT
Jonas, AFAIK if a repair completed successfully there should be no streaming the next time round.
This sounds odd; please look into it if you can.

Can you run at DEBUG logging? There will be some messages about the streams and files being
received and which ranges are being requested.
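For example, on 0.7 you could raise the level for the repair and streaming classes in
conf/log4j-server.properties along these lines (class and package names from memory, so
please double check them against your install):

  log4j.logger.org.apache.cassandra.service.AntiEntropyService=DEBUG
  log4j.logger.org.apache.cassandra.streaming=DEBUG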

I would be interested to know if the repair is completing successfully. You should see messages
such as "Repair session blah completed successfully" if it is. It is possible for repair to hang
if one of the neighbours goes away or fails to send the data. In this case the repair session
will time out after 48 hours.
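Something like this against the system log should show whether each session finished (the log
path below is the default packaged location, adjust it to wherever yours lives):

  grep -i "repair session" /var/log/cassandra/system.log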

Aaron

On 4 Apr 2011, at 20:39, Roland Gude wrote:

> I am experiencing the same behavior but had it on previous versions of 0.7 as well.
> 
> 
> -----Original Message-----
> From: Jonas Borgström [mailto:jonas.borgstrom@trioptima.com] 
> Sent: Monday, 4 April 2011 12:26
> To: user@cassandra.apache.org
> Subject: Strange nodetool repair behaviour
> 
> Hi,
> 
> I have a 6 node 0.7.4 cluster with replication_factor=3 where "nodetool
> repair keyspace" behaves really strangely.
> 
> The keyspace contains three column families and about 60 GB of data in total
> (i.e. 30 GB on each node).
> 
> Even though no data has been added or deleted since the last repair, a
> repair takes hours and the repairing node seems to receive 100+ GB worth
> of sstable data from its neighbouring nodes, i.e. several times the
> actual data size.
> 
> The log says things like:
> 
> "Performing streaming repair of 27 ranges"
> 
> And a bunch of:
> 
> "Compacted to <filename> 22,208,983,964 to 4,816,514,033 (~21% of original)"
> 
> In the end the repair finishes without any error after a few hours, but
> even then the active sstables seem to contain lots of redundant data,
> since the disk usage can be cut in half by triggering a major compaction.
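> 
> (For reference, by major compaction I mean something like
> "nodetool -h <host> compact <keyspace>", with host and keyspace as placeholders.)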
> 
> All this leads me to believe that something stops the AES from correctly
> figuring out what data is already on the repairing node and what needs
> to be streamed from the neighbours.
> 
> The only thing I can think of right now is that one of the column
> families contains a lot of large rows, larger than memtable_throughput,
> and perhaps that is what is confusing the merkle tree.
> 
> Anyway, is this a known problem or perhaps expected behaviour?
> Otherwise I'll try to create a more reproducible test case.
> 
> Regards,
> Jonas
> 
> 

