From: Robert Coli
To: user@cassandra.apache.org
Date: Tue, 10 Sep 2013 10:17:08 -0700
Subject: Re: heavy insert load overloads CPUs, with MutationStage pending
In-Reply-To: <522F32FC.2030804@gmail.com>

On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman <8forty@gmail.com> wrote:
> On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for commitlog and data

On SSD, you don't need to separate commitlog and data. You only win from this separation if there is a disk head that you can keep from moving between appends to the commit log. You will get better I/O from a stripe that includes the additional SSD.

> Pool Name               Active   Pending      Completed   Blocked  All time blocked
> MutationStage                1         9         290394         0                 0
> FlushWriter                  1         2             20         0                 0
>
> I can't seem to find information about the real meaning of MutationStage, is this just normal for lots of inserts?

The mutation stage is the stage in which mutations to rows in memtables ("writes") occur.

The FlushWriter stage is the stage that turns memtables into SSTables by flushing them.

However, 9 pending mutations is a very small number. For reference, on an overloaded cluster that was being written to death, I recently saw 1216434 pending in MutationStage. What problem other than "high CPU load" are you experiencing? 2 pending FlushWriter tasks is slightly suggestive of some sort of bound related to flushing.
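
To make the comparison above concrete, here is a minimal sketch that parses thread pool statistics in the layout quoted in this thread (it looks like `nodetool tpstats` output) and lists the stages whose pending count exceeds a threshold. The sample text is copied from the quoted table; the function names and the thresholds are assumptions chosen for illustration, not recommended values.

```python
# Sketch only: parses the thread pool table quoted in this thread.
# The column layout is assumed to match that table exactly.

SAMPLE = """\
Pool Name               Active   Pending      Completed   Blocked  All time blocked
MutationStage                1         9         290394         0                 0
FlushWriter                  1         2             20         0                 0
"""

FIELDS = ("active", "pending", "completed", "blocked", "all_time_blocked")


def parse_tpstats(text):
    """Return {pool_name: {field: int}} for every data row in the text."""
    pools = {}
    for line in text.splitlines():
        parts = line.split()
        # A data row is a pool name followed by exactly five integer columns;
        # the header row and blank lines do not match and are skipped.
        if len(parts) != 6 or not all(p.isdigit() for p in parts[1:]):
            continue
        pools[parts[0]] = dict(zip(FIELDS, map(int, parts[1:])))
    return pools


def pending_over(pools, threshold):
    """Hypothetical helper: sorted names of pools with pending > threshold."""
    return sorted(name for name, stats in pools.items()
                  if stats["pending"] > threshold)


if __name__ == "__main__":
    pools = parse_tpstats(SAMPLE)
    for name, stats in pools.items():
        print(name, "pending:", stats["pending"])
    # The threshold of 1000 is arbitrary; pick one that suits your cluster.
    print("over threshold:", pending_over(pools, 1000))
```

With the numbers from this thread, no stage exceeds a threshold of 1000, which matches the point that 9 pending mutations is small.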

> Also, switching from spinning disks to SSDs didn't seem to significantly improve insert performance, so it seems clear my use-case is totally CPU-bound. Cassandra docs say "Insert-heavy workloads are CPU-bound in Cassandra before becoming memory-bound.", so I guess that's what I'm seeing, but there's no explanation. So I'm wondering what's overloading my CPUs, and is there anything I can do about it short of adding more nodes?

Insert performance is pretty optimized from an I/O perspective. There is probably not too much you can do. You can disable durability guarantees if you truly require insert performance at all costs.
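
The message does not say which setting is meant by disabling durability guarantees. One possible reading is the per-keyspace `durable_writes` option in CQL, which makes writes to that keyspace skip the commit log, so any data not yet flushed from memtables is lost if a node goes down. The sketch below only builds the statement text; the helper name and the keyspace name `metrics_ks` are hypothetical.

```python
def durable_writes_cql(keyspace, enabled):
    """Build an ALTER KEYSPACE statement that sets durable_writes.

    Hypothetical helper for illustration: it returns the CQL text and
    does not connect to a cluster.
    """
    # Accept only simple identifiers so the name can be used unquoted.
    if not keyspace.isidentifier():
        raise ValueError("unexpected keyspace name: %r" % keyspace)
    flag = "true" if enabled else "false"
    return "ALTER KEYSPACE %s WITH durable_writes = %s;" % (keyspace, flag)


if __name__ == "__main__":
    # 'metrics_ks' is a made-up keyspace name.
    print(durable_writes_cql("metrics_ks", False))
```

The resulting statement would be run through cqlsh or a driver. Whether the trade-off is acceptable depends on how much un-flushed data you can afford to lose.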

That said, the percentage of people running Cassandra on SSDs is still relatively low. It is likely that performance improvements wrt CPU usage are possible.

=Rob