incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: High CPU usage on all nodes without any read or write
Date Tue, 13 Jul 2010 20:11:25 GMT
Did you look at compaction activity?

On Mon, Jul 12, 2010 at 9:31 AM, Olivier Rosello <orosello@corp.free.fr> wrote:
>> > But in the Cassandra output log:
>> > root@cassandra-2:~#  tail -f /var/log/cassandra/output.log
>> >  INFO 15:32:05,390 GC for ConcurrentMarkSweep: 1359 ms, 4295787600 reclaimed leaving 1684169392 used; max is 6563430400
>> >  INFO 15:32:09,875 GC for ConcurrentMarkSweep: 1363 ms, 4296991416 reclaimed leaving 1684201560 used; max is 6563430400
>> >  INFO 15:32:14,370 GC for ConcurrentMarkSweep: 1341 ms, 4295467880 reclaimed leaving 1684879440 used; max is 6563430400
>> >  INFO 15:32:18,906 GC for ConcurrentMarkSweep: 1343 ms, 4296386408 reclaimed leaving 1685489208 used; max is 6563430400
>> >  INFO 15:32:23,564 GC for ConcurrentMarkSweep: 1511 ms, 4296407088 reclaimed leaving 1685488744 used; max is 6563430400
>> >  INFO 15:32:28,068 GC for ConcurrentMarkSweep: 1347 ms, 4295383216 reclaimed leaving 1686469448 used; max is 6563430400
>> >  INFO 15:32:32,617 GC for ConcurrentMarkSweep: 1376 ms, 4295689192 reclaimed leaving 1687908304 used; max is 6563430400
>> >  INFO 15:32:37,283 GC for ConcurrentMarkSweep: 1468 ms, 4296056176 reclaimed leaving 1687916880 used; max is 6563430400
>> >  INFO 15:32:41,811 GC for ConcurrentMarkSweep: 1358 ms, 4296412232 reclaimed leaving 1688437064 used; max is 6563430400
>> >  INFO 15:32:46,436 GC for ConcurrentMarkSweep: 1368 ms, 4296105472 reclaimed leaving 1691050032 used; max is 6563430400
>> >  INFO 15:32:51,180 GC for ConcurrentMarkSweep: 1545 ms, 4297439832 reclaimed leaving 1691033816 used; max is 6563430400
>> >  INFO 15:32:55,703 GC for ConcurrentMarkSweep: 1379 ms, 4295491928 reclaimed leaving 1692891456 used; max is 6563430400
>> >  INFO 15:33:00,328 GC for ConcurrentMarkSweep: 1378 ms, 4296657208 reclaimed leaving 1694981528 used; max is 6563430400
>>
>> Note that those are ConcurrentMarkSweep GC:s rather than ParNew:s, so
>> should be running concurrently with the application and should not
>> correlate to 1.3 second pauses for the application.
>
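
For what it's worth, the GC lines quoted above can be summarized with a few lines of Python. This is only a sketch: the regular expression assumes exactly the log format shown in the quote (with the wrapped lines rejoined), the function names are made up for illustration, and reading the "reclaimed" figure as garbage produced since the previous cycle is an assumption. It reports the interval between cycles, the fraction of wall-clock time during which a CMS cycle was running, and the implied heap refill rate.

```python
import re
from datetime import datetime

# Three of the log lines quoted above, with the wrapped text rejoined.
LOG = """\
 INFO 15:32:05,390 GC for ConcurrentMarkSweep: 1359 ms, 4295787600 reclaimed leaving 1684169392 used; max is 6563430400
 INFO 15:32:09,875 GC for ConcurrentMarkSweep: 1363 ms, 4296991416 reclaimed leaving 1684201560 used; max is 6563430400
 INFO 15:32:14,370 GC for ConcurrentMarkSweep: 1341 ms, 4295467880 reclaimed leaving 1684879440 used; max is 6563430400
"""

# Assumed line format; adjust if your log differs.
PATTERN = re.compile(
    r"INFO (\d{2}:\d{2}:\d{2},\d{3}) GC for (\w+): (\d+) ms, "
    r"(\d+) reclaimed leaving (\d+) used; max is (\d+)"
)


def parse_gc_lines(log):
    """Return (time, collector, duration_ms, reclaimed, used, max) tuples."""
    events = []
    for m in PATTERN.finditer(log):
        # The log has no date, so runs that cross midnight are not handled.
        ts = datetime.strptime(m.group(1), "%H:%M:%S,%f")
        numbers = tuple(int(g) for g in m.groups()[2:])
        events.append((ts, m.group(2)) + numbers)
    return events


def summarize(events):
    """For each consecutive pair: (interval_s, gc_fraction, refill_mb_per_s)."""
    rows = []
    for prev, cur in zip(events, events[1:]):
        interval_s = (cur[0] - prev[0]).total_seconds()
        gc_fraction = (cur[2] / 1000.0) / interval_s
        # Assumption: bytes reclaimed were allocated since the previous cycle.
        refill_mb_per_s = cur[3] / interval_s / 1e6
        rows.append((interval_s, gc_fraction, refill_mb_per_s))
    return rows


if __name__ == "__main__":
    for interval_s, gc_fraction, refill in summarize(parse_gc_lines(LOG)):
        print("interval %.3f s, CMS running %.0f%% of the time, ~%.0f MB/s refilled"
              % (interval_s, gc_fraction * 100, refill))
```

On the quoted lines, cycles start roughly every 4.5 seconds and each runs for about 1.3 to 1.5 seconds, so a collection is in progress for roughly 30% of the time even with no client traffic.
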
> When I see this behaviour (ConcurrentMarkSweep, high CPU...), Cassandra is running but
> there have been no writes and no reads for hours (I stopped reads and writes when the
> behaviour started).
>
> Even after wiping the data on all nodes, the behaviour started again after some
> hours of writing... :-(
>
>
>> As for the discrepancy between nodes, are all nodes handling a similar
>> amount of traffic? I briefly checked your original post and you said
>> you're doing TimeUUID insertions. I don't remember offhand, and a quick
>> google didn't tell me, whether there is something special about the
>> TimeUUID type that would prevent it, but normally if you're using an
>> OrderedPartitioner you may simply be writing all your data to a single
>> node, both because of how the token space is divided and because
>> timestamps are highly ordered.
>
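
To illustrate the point quoted above about ordered keys, here is a toy model in Python. It is a sketch under stated assumptions, not Cassandra code: the four-node rings, their tokens, the timestamp-style row keys, and the names `owner` and `hashed_token` are all hypothetical. It only shows that keys which sort close together land in one token range when placement follows key order, while hashing the key first spreads them out.

```python
import bisect
import hashlib
from collections import Counter

# Hypothetical ring for an order-preserving layout: each node owns the keys
# that sort after the previous token, up to and including its own token.
ORDERED_RING = [("g", "node-1"), ("n", "node-2"), ("t", "node-3"), ("z", "node-4")]

# Hypothetical ring with evenly spaced numeric tokens for a hashed layout.
HASHED_RING = [(i * 2**127 // 4, "node-%d" % (i + 1)) for i in range(4)]


def owner(ring, token):
    """Return the node whose token is the first one >= token, wrapping around."""
    tokens = [t for t, _ in ring]
    i = bisect.bisect_left(tokens, token)
    return ring[i % len(ring)][1]


def hashed_token(key):
    """Map a key to a numeric token via MD5 (a stand-in for a hashing partitioner)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % 2**127


# Hypothetical time-ordered row keys, one per second for a minute.
keys = ["2010-07-12T15:32:%02d" % s for s in range(60)]

ordered = Counter(owner(ORDERED_RING, k) for k in keys)
hashed = Counter(owner(HASHED_RING, hashed_token(k)) for k in keys)

if __name__ == "__main__":
    print("ordered placement:", dict(ordered))
    print("hashed placement: ", dict(hashed))
```

In this toy setup every time-ordered key falls into the first node's range under ordered placement, which is the kind of imbalance the quoted paragraph describes.
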
> Theoretically yes. But in fact, this behaviour happens first on the heaviest nodes (those
> holding the most data).
>
>> How big a latency are we talking about in the cases where you're
>> timing out (i.e., what's the timeout)? Were the timeouts on reads,
>> writes or both?
>
> The TimeOutExceptions are on writes (using C++ code -> thrift -> cassandra). This cluster
> is used 99% for writes.
>
> How can I get/measure latency?
>
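
One way to get client-side latency numbers is to time each call around the client library. A minimal sketch follows, in Python rather than the C++ client mentioned above; `timed_call`, `percentile`, and `fake_write` are hypothetical names, and `fake_write` merely stands in for whatever Thrift insert call the application makes.

```python
import math
import time


def timed_call(fn, *args, **kwargs):
    """Call fn and return (result, ok, elapsed_ms); exceptions count as failures."""
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
        ok = True
    except Exception:
        result, ok = None, False
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, ok, elapsed_ms


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def fake_write(key, value):
    """Placeholder for the real write call; sleeps briefly instead."""
    time.sleep(0.001)
    return True


if __name__ == "__main__":
    samples, failures = [], 0
    for i in range(100):
        _, ok, elapsed_ms = timed_call(fake_write, "key-%d" % i, "value")
        if ok:
            samples.append(elapsed_ms)
        else:
            failures += 1
    samples.sort()
    print("writes ok: %d, failed: %d" % (len(samples), failures))
    print("median: %.2f ms, p99: %.2f ms, max: %.2f ms"
          % (percentile(samples, 50), percentile(samples, 99), samples[-1]))
```

Tracking the high percentiles and the failure count, rather than only the average, makes it easier to see how close normal writes come to the client timeout.
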
>
> Olivier
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
