cassandra-user mailing list archives

From Alain RODRIGUEZ <arodr...@gmail.com>
Subject Re: Compaction -> CPU load 100% -> time out
Date Tue, 22 Nov 2011 15:58:33 GMT
I followed your advice and installed a cluster of 3 m1.small instances. The
problem is still there. I get fewer timeouts because there are fewer
compactions, thanks to the larger amount of memory usable before flushing, but
when a compaction starts, CPU usage can reach 95%, which produces
timeouts. Compactions run faster, so I get fewer timeouts, but there are
still some.

Is there really no way to turn compaction into a low-CPU background
task?

What kind of information can I give you to help you understand what is
going on with these timeouts?

2011/11/15 Dan Hendry <dan.hendry.junk@gmail.com>

> I really don’t recommend using t1.micros. The problem with them is that
> they have CPU bursting, basically meaning you get lots of CPU resources for
> a short time, but if you use more than you have been allocated you get
> basically nothing for 10+ seconds afterwards. By ‘basically nothing’ I
> really mean that: the machine is effectively dead. The biggest problem
> with this (which we found out the hard way, within a test environment
> thankfully) is that it makes capacity planning extremely difficult: the
> line between having a cluster with sufficient capacity and being overloaded
> is extremely abrupt and very difficult to see coming. Moreover, once you are
> over capacity, the ‘dead periods’ caused by CPU bursting cause things to
> spiral out of control rapidly due to overly aggressive client retries and
> hinted handoff increasing overall load (although the HH problem might have
> improved with 1.0.x). I would recommend m1.smalls at the very least.
>
> If you are set on micros, make sure you only ever trigger compaction on
> one node at a time (or better, consider if you even need to trigger major
> compactions at all), set compaction_throughput_mb_per_sec (cassandra.yaml)
> as low as you possibly can (1 is the minimum I believe), try disabling
> hinted handoff (on all nodes), and use lower read/write consistency levels
> if you can.
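>
> A minimal sketch of those two settings, assuming the option names from the
> stock Cassandra 1.0.x cassandra.yaml (the values are illustrative, not tuned
> recommendations, and the file is read when the node starts):
>
> ```yaml
> # Throttle compaction to roughly 1 MB/s in total on this node.
> # 0 disables throttling entirely, so 1 is the lowest throttled value.
> compaction_throughput_mb_per_sec: 1
>
> # Stop this node from storing and replaying hints for unavailable replicas.
> hinted_handoff_enabled: false
> ```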
>
> Dan
>
> *From:* Alain RODRIGUEZ [mailto:arodrime@gmail.com]
> *Sent:* November-15-11 6:34
> *To:* user@cassandra.apache.org
> *Subject:* Compaction -> CPU load 100% -> time out
>
> Hi, I'm running a 3-node Cassandra 1.0.2 cluster on 3 Amazon EC2 t1.micro
> instances.
>
> I managed to fix some OOM errors I had, but I still have some spikes in CPU
> load.
>
> I know that t1.micro instances have limited resources, but I think they
> could be enough if they were well managed.
>
> My application works well, except when Cassandra needs to run a
> compaction on a node. To do it, Cassandra uses 100% of the CPU, generating
> a lot of timeouts. My timeout is configured to 250 ms with a maximum of 2
> attempts. I'm running in production; our current system uses MySQL and we
> are trying to replace MySQL with Cassandra. Cassandra mustn't slow down the
> production environment while we use both databases in parallel, which is why
> I can't increase the timeout.
>
> Running this compaction in the background somehow could be a good idea.
> After searching on this subject, I tried adding JVM_OPTS="$JVM_OPTS
> -Dcassandra.compaction.priority=1" to cassandra-env.sh.
>
> This option was added in Cassandra 0.6.3; is it still useful? It
> doesn't resolve my problem.
>
> Anyway, this doesn't help while performing a nodetool repair; the CPU
> load is still 100%.
>
> Is there a way to turn these exceptional tasks into background tasks
> that use only spare CPU?
>
> Is there a way to get Cassandra working properly on EC2 t1.micros?
>
> Thanks,
>
> Alain