incubator-cassandra-user mailing list archives

From Robert Coli <rc...@eventbrite.com>
Subject Re: heavy insert load overloads CPUs, with MutationStage pending
Date Tue, 10 Sep 2013 17:17:08 GMT
On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman <8forty@gmail.com> wrote:

> On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for commitlog and
> data


On SSDs, you don't need to separate the commitlog and data. You only win
from this separation if you have a disk head that can stay in place between
appends to the commit log. You will get better IO from a stripe that
includes the additional SSD.


>> Pool Name       Active   Pending   Completed   Blocked   All time blocked
>> MutationStage        1         9      290394         0                  0
>> FlushWriter          1         2          20         0                  0
>>
>

> I can't seem to find information about the real meaning of MutationStage,
> is this just normal for lots of inserts?
>

The mutation stage is the stage in which mutations to rows in memtables
("writes") occur.

The FlushWriter stage is the stage that turns memtables into SSTables by
flushing them.

However, 9 pending mutations is a very small number. For reference, on an
overloaded cluster that was being written to death, I recently saw 1216434
pending in MutationStage. What problem other than "high CPU load" are you
experiencing? 2 pending FlushWriters is slightly suggestive of some sort of
bottleneck related to flushing.
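
To make the comparison above easier to repeat, here is a minimal sketch that
parses tpstats-style rows and lists pools whose pending count exceeds a
threshold. The sample text copies the rows quoted in this thread; the
function names and the threshold values are hypothetical, chosen only for
illustration, and are not recommended limits.

```python
import re

# Sample rows copied from the tpstats output quoted in this thread.
SAMPLE = """\
Pool Name       Active   Pending   Completed   Blocked   All time blocked
MutationStage        1         9      290394         0                  0
FlushWriter          1         2          20         0                  0
"""

# A data row is a pool name followed by exactly five integer columns.
ROW = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def parse_tpstats(text):
    """Return {pool_name: {column: int}} for every row that looks like a pool."""
    pools = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header line or anything that is not a pool row
        values = [int(v) for v in match.groups()[1:]]
        pools[match.group(1)] = {
            "active": values[0],
            "pending": values[1],
            "completed": values[2],
            "blocked": values[3],
            "all_time_blocked": values[4],
        }
    return pools


def backlogged(pools, threshold):
    """Return sorted pool names whose pending count exceeds `threshold`.

    `threshold` is a hypothetical, caller-chosen number.
    """
    return sorted(n for n, s in pools.items() if s["pending"] > threshold)


if __name__ == "__main__":
    pools = parse_tpstats(SAMPLE)
    print(pools["MutationStage"]["pending"])
    print(backlogged(pools, threshold=1000))
```

With the numbers quoted here, nothing is flagged at a threshold of 1000,
which matches the point that 9 pending mutations is small.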


> Also, switching from spinning disks to SSDs didn't seem to significantly
> improve insert performance, so it seems clear my use-case is totally
> CPU-bound.  Cassandra docs say "Insert-heavy workloads are CPU-bound in
> Cassandra before becoming memory-bound.", so I guess that's what I'm
> seeing, but there's no explanation. So I'm wondering what's overloading my
> CPUs, and is there anything I can do about it short of adding more nodes?
>

Insert performance is pretty optimized from an I/O perspective. There is
probably not too much you can do. You can disable durability guarantees if
you truly require insert performance at all costs.

That said, the percentage of people running Cassandra on SSDs is still
relatively low. It is likely that performance improvements wrt CPU usage
are possible.
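
On the durability trade-off mentioned above: one way to give up durability
is the keyspace-level durable_writes option in CQL 3, which skips the commit
log for that keyspace. Whether that is the mechanism intended in this reply
is an assumption. The sketch below only builds the statement text; the
helper name and the keyspace name are hypothetical.

```python
import re

# Unquoted CQL identifiers: a letter followed by letters, digits, underscores.
_IDENT = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def durable_writes_cql(keyspace, enabled):
    """Build an ALTER KEYSPACE statement that toggles durable_writes.

    Hypothetical helper: it returns a string and never contacts a cluster.
    """
    if not _IDENT.match(keyspace):
        raise ValueError("not a plain unquoted CQL identifier: %r" % keyspace)
    flag = "true" if enabled else "false"
    return "ALTER KEYSPACE {} WITH durable_writes = {};".format(keyspace, flag)


if __name__ == "__main__":
    # "demo_ks" is a placeholder keyspace name.
    print(durable_writes_cql("demo_ks", enabled=False))
```

With durable_writes off, writes that exist only in memtables are lost if a
node dies before they are flushed, so this is only reasonable for data that
can be reloaded or regenerated.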

=Rob
