Date: Fri, 6 Dec 2013 22:35:07 +0530
Subject: Re: Write performance with 1.2.12
From: Vicky Kak
To: user@cassandra.apache.org

Can you set the memtable_total_space_in_mb value? It defaults to 1/3 of the
heap, which here is 8/3 ~ 2.6 GB.
http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management

Flushing 2.6 GB to disk might slow performance if it happens frequently;
maybe you have a lot of write operations going on.


On Fri, Dec 6, 2013 at 10:06 PM, srmore wrote:

> On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak wrote:
>
>> You have passed the JVM configurations and not the cassandra
>> configurations, which are in cassandra.yaml.
>
> Apologies, I was tuning the JVM and that's what was on my mind.
> Here are the cassandra settings: http://pastebin.com/uN42GgYT
>
>> The spikes are not that significant in our case and we are running the
>> cluster with a 1.7 GB heap.
>>
>> Are these spikes causing any issue at your end?
>
> There are no big spikes; the overall performance seems to be about 40% lower.
>
>> On Fri, Dec 6, 2013 at 9:10 PM, srmore wrote:
>>
>>> On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak wrote:
>>>
>>>> Hard to say much without knowing about the cassandra configurations.
>>>
>>> The cassandra configuration is
>>> -Xms8G
>>> -Xmx8G
>>> -Xmn800m
>>> -XX:+UseParNewGC
>>> -XX:+UseConcMarkSweepGC
>>> -XX:+CMSParallelRemarkEnabled
>>> -XX:SurvivorRatio=4
>>> -XX:MaxTenuringThreshold=2
>>> -XX:CMSInitiatingOccupancyFraction=75
>>> -XX:+UseCMSInitiatingOccupancyOnly
>>>
>>>> Yes, compactions/GCs could spike the CPU; I had similar behavior with
>>>> my setup.
>>>
>>> Were you able to get around it?
>>>
>>>> -VK
>>>>
>>>> On Fri, Dec 6, 2013 at 7:40 PM, srmore wrote:
>>>>
>>>>> We have a 3 node cluster running cassandra 1.2.12. They are pretty big
>>>>> machines, 64G RAM with 16 cores; the cassandra heap is 8G.
>>>>>
>>>>> The interesting observation is that when I send traffic to one node,
>>>>> its performance is 2x higher than when I send traffic to all the nodes.
>>>>> We ran 1.0.11 on the same box and we observed a slight dip, but not the
>>>>> halving seen with 1.2.12. In both cases we were writing with
>>>>> LOCAL_QUORUM. Changing CL to ONE makes a slight improvement but not much.
>>>>>
>>>>> The read_repair_chance is 0.1. We see some compactions running.
>>>>>
>>>>> Following is my iostat -x output; sda is the SSD (for the commit log)
>>>>> and sdb is the spinner.
>>>>>
>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>           66.46    0.00    8.95    0.01    0.00   24.58
>>>>>
>>>>> Device:    rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
>>>>> sda          0.00    27.60     0.00     4.40     0.00   256.00    58.18     0.01     2.55     1.32     0.58
>>>>> sda1         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
>>>>> sda2         0.00    27.60     0.00     4.40     0.00   256.00    58.18     0.01     2.55     1.32     0.58
>>>>> sdb          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
>>>>> sdb1         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
>>>>> dm-0         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
>>>>> dm-1         0.00     0.00     0.00     0.60     0.00     4.80     8.00     0.00     5.33     2.67     0.16
>>>>> dm-2         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
>>>>> dm-3         0.00     0.00     0.00    24.80     0.00   198.40     8.00     0.24     9.80     0.13     0.32
>>>>> dm-4         0.00     0.00     0.00     6.60     0.00    52.80     8.00     0.01     1.36     0.55     0.36
>>>>> dm-5         0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
>>>>> dm-6         0.00     0.00     0.00    24.80     0.00   198.40     8.00     0.29    11.60     0.13     0.32
>>>>>
>>>>> I can see I am CPU bound here but couldn't figure out exactly what is
>>>>> causing it. Is this caused by GC or compaction? I am thinking it is
>>>>> compaction; I see a lot of context switches and interrupts in my vmstat
>>>>> output.
>>>>>
>>>>> I don't see GC activity in the logs but see some compaction activity.
>>>>> Has anyone seen this, or does anyone know what can be done to free up
>>>>> the CPU?
>>>>>
>>>>> Thanks,
>>>>> Sandeep
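The "CPU bound" reading of the iostat output quoted in this thread can be checked with a little arithmetic. The sketch below is only an illustration: the numbers are copied from the avg-cpu line and the sda row of the original post, the helper name is made up for this example, and the 512-byte sector size is an assumption about how this iostat build reports rsec/s and wsec/s.

```python
# Values copied from the avg-cpu line of the iostat -x output in the thread.
AVG_CPU = {"user": 66.46, "nice": 0.00, "system": 8.95,
           "iowait": 0.01, "steal": 0.00, "idle": 24.58}

# wsec/s for sda (the commit-log SSD), also copied from the thread.
SDA_WSEC_PER_S = 256.00

# Assumption: sectors reported by iostat are 512 bytes each.
SECTOR_BYTES = 512


def cpu_busy_percent(sample: dict) -> float:
    """Hypothetical helper: time spent running code (user + nice + system)."""
    return round(sample["user"] + sample["nice"] + sample["system"], 2)


busy = cpu_busy_percent(AVG_CPU)
write_kib_per_s = SDA_WSEC_PER_S * SECTOR_BYTES / 1024

print(f"CPU busy: {busy}%  iowait: {AVG_CPU['iowait']}%")
print(f"sda write rate: {write_kib_per_s:.0f} KiB/s")
```

With roughly three quarters of CPU time spent in user and system code, almost no iowait, and the commit-log device writing on the order of a hundred KiB/s, this sample does look CPU-dominated rather than disk-dominated. It does not by itself say whether GC or compaction is using the CPU.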


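The sizing argument in the reply at the top of this thread (memtable_total_space_in_mb defaulting to one third of the heap) can be worked through as below. This is a minimal sketch, not Cassandra code: the one-third default is taken from the reply itself, the 8G heap comes from the -Xmx8G flag quoted in the thread, and the function name is invented for this example.

```python
def default_memtable_space_mb(heap_mb: int, divisor: int = 3) -> int:
    """Hypothetical helper: memtable budget in MB when
    memtable_total_space_in_mb is left unset, assuming the 1/3-of-heap
    default described in the reply."""
    return heap_mb // divisor


heap_mb = 8 * 1024  # -Xmx8G from the quoted JVM settings
budget_mb = default_memtable_space_mb(heap_mb)

print(f"heap: {heap_mb} MB")
print(f"default memtable budget: {budget_mb} MB (~{budget_mb / 1024:.2f} GB)")
```

For an 8G heap this comes to about 2.67 GB, which the reply rounds to 2.6 GB. Setting memtable_total_space_in_mb explicitly in cassandra.yaml replaces this computed default; what value suits this cluster depends on the write workload and is not something the thread settles.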