Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (athena.apache.org: domain of rcoli@eventbrite.com
 designates 209.85.216.172 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <522F32FC.2030804@gmail.com>
References: <522F32FC.2030804@gmail.com>
Date: Tue, 10 Sep 2013 10:17:08 -0700
Message-ID: 
 <CAEDUwd248J4QZinuFOjNowd_6ep3QE_OSdECZ6FFMeVfdtGERA@mail.gmail.com>
Subject: Re: heavy insert load overloads CPUs, with MutationStage pending
From: Robert Coli <rcoli@eventbrite.com>
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Content-Type: multipart/alternative; boundary=047d7bd76dbabeb02704e60aac88

--047d7bd76dbabeb02704e60aac88
Content-Type: text/plain; charset=ISO-8859-1

On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman <8forty@gmail.com> wrote:

> On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for commitlog and
> data


On SSD, you don't need to separate commitlog and data. You only win from
this separation if you have a disk head that then does not have to move
between appends to the commit log. You will get better IO from a stripe
with an additional SSD.


>> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
>> MutationStage                     1         9         290394         0                 0
>> FlushWriter                       1         2             20         0                 0
>

> I can't seem to find information about the real meaning of MutationStage.
> Is this just normal for lots of inserts?
>

The mutation stage is the stage in which mutations to rows in memtables
("writes") occur.

The FlushWriter stage is the stage that turns memtables into SSTables by
flushing them.

However, 9 pending mutations is a very small number. For reference, on an
overloaded cluster which was being written to death, I recently saw
1216434 pending in MutationStage. What problem other than "high CPU load"
are you experiencing? 2 pending FlushWriters is slightly suggestive of some
sort of bound related to flushing.
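
As a rough illustration of reading those numbers, here is a minimal Python
sketch that parses text laid out like the tpstats output quoted above and
lists pools with a backlog. The function names (parse_tpstats, backlog) and
the threshold values are hypothetical, chosen only for this example; they
are not part of Cassandra or nodetool, and the right threshold depends on
your cluster.

```python
import re

# Sample text in the same layout as the tpstats output quoted above.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     1         9         290394         0                 0
FlushWriter                       1         2             20         0                 0
"""

# One pool row: a name followed by exactly five integer columns.
# The header line has no integer columns, so it never matches.
ROW = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def parse_tpstats(text):
    """Return {pool name: {column: int}} for every line that matches ROW."""
    pools = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue
        active, pending, completed, blocked, all_time = map(int, match.groups()[1:])
        pools[match.group(1)] = {
            "active": active,
            "pending": pending,
            "completed": completed,
            "blocked": blocked,
            "all_time_blocked": all_time,
        }
    return pools


def backlog(pools, pending_threshold):
    """Names of pools at or above the (caller-chosen, hypothetical)
    pending threshold, or that have ever blocked."""
    return [
        name
        for name, stats in pools.items()
        if stats["pending"] >= pending_threshold or stats["all_time_blocked"] > 0
    ]


if __name__ == "__main__":
    pools = parse_tpstats(SAMPLE)
    # The threshold of 1000 is an arbitrary value for illustration only.
    print(backlog(pools, 1000))
```

With the sample above and a threshold of 1000, nothing is flagged, which
matches the point that 9 pending mutations is small.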


> Also, switching from spinning disks to SSDs didn't seem to significantly
> improve insert performance, so it seems clear my use-case is totally
> CPU-bound.  Cassandra docs say "Insert-heavy workloads are CPU-bound in
> Cassandra before becoming memory-bound.", so I guess that's what I'm
> seeing, but there's no explanation. So I'm wondering what's overloading my
> CPUs, and is there anything I can do about it short of adding more nodes?
>

Insert performance is pretty optimized from an I/O perspective. There is
probably not too much you can do. You can disable durability guarantees if
you truly require insert performance at all costs.

That said, the percentage of people running Cassandra on SSDs is still
relatively low. It is likely that performance improvements wrt CPU usage
are possible.

=Rob


--047d7bd76dbabeb02704e60aac88--