From: Bryan Talbot
To: user@cassandra.apache.org
Reply-To: user@cassandra.apache.org
Date: Fri, 19 Oct 2012 10:59:02 -0700
Subject: Re: constant CMS GC using CPU time
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=047d7b62461e4fc1e004cc6d4200

--047d7b62461e4fc1e004cc6d4200
Content-Type: text/plain; charset=UTF-8

OK, let me try asking the question a different way ...

How does Cassandra use memory, and how can I plan how much is needed? I have a 1 GB memtable and a 5 GB total heap, and that's still not enough even though the number of concurrent connections and the garbage generation rate are fairly low.

If I were using MySQL or Oracle, I could compute how much memory could be used by N concurrent connections, how much is allocated for caching, temp space, etc. How can I do this for Cassandra? Currently it seems like the memory used scales with the number of bytes stored and not with how busy the server actually is. That's not such a good thing.

-Bryan

On Thu, Oct 18, 2012 at 11:06 AM, Bryan Talbot wrote:

> In a 4 node cluster running Cassandra 1.1.5 with sun jvm 1.6.0_29-b11
> (64-bit), the nodes are often getting "stuck" in a state where CMS
> collections of the old space are constantly running.
>
> The JVM configuration is using the standard settings in cassandra-env --
> relevant settings are included below. The max heap is currently set to 5
> GB with 800MB for new size. I don't believe that the cluster is overly
> busy and seems to be performing well enough other than this issue. When
> nodes get into this state they never seem to leave it (by freeing up old
> space memory) without restarting cassandra. They typically enter this
> state while running "nodetool repair -pr" but once they start doing this,
> restarting them only "fixes" it for a couple of hours.
>
> Compactions are completing and are generally not queued up. All CF are
> using STCS. The busiest CF consumes about 100GB of space on disk, is write
> heavy, and all columns have a TTL of 3 days. Overall, there are 41 CF
> including those used for system keyspace and secondary indexes. The number
> of SSTables per node currently varies from 185-212.
>
> Other than frequent log warnings about "GCInspector - Heap is 0.xxx
> full..." and "StorageService - Flushing CFS(...) to relieve memory
> pressure" there are no other log entries to indicate there is a problem.
>
> Does the memory needed vary depending on the amount of data stored? If
> so, how can I predict how much jvm space is needed? I don't want to make
> the heap too large as that's bad too. Maybe there's a memory leak related
> to compaction that doesn't allow meta-data to be purged?
>
>
> -Bryan
>
>
> 12 GB of RAM in host with ~6 GB used by java and ~6 GB for OS and buffer
> cache.
> $> free -m
>              total       used       free     shared    buffers     cached
> Mem:         12001      11870        131          0          4       5778
> -/+ buffers/cache:       6087       5914
> Swap:            0          0          0
>
>
> jvm settings in cassandra-env
> MAX_HEAP_SIZE="5G"
> HEAP_NEWSIZE="800M"
>
> # GC tuning options
> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
> JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
> JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
> JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
> JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
> JVM_OPTS="$JVM_OPTS -XX:+UseCompressedOops"
>
>
> jstat shows about 12 full collections per minute with old heap usage
> constantly over 75% so CMS is always over the
> CMSInitiatingOccupancyFraction threshold.
>
> $> jstat -gcutil -t 22917 5000 4
> Timestamp     S0     S1      E      O      P    YGC     YGCT    FGC     FGCT      GCT
>  132063.0  34.70   0.00  26.03  82.29  59.88  21580  506.887  17523 3078.941 3585.829
>  132068.0  34.70   0.00  50.02  81.23  59.88  21580  506.887  17524 3079.220 3586.107
>  132073.1   0.00  24.92  46.87  81.41  59.88  21581  506.932  17525 3079.583 3586.515
>  132078.1   0.00  24.92  64.71  81.40  59.88  21581  506.932  17527 3079.853 3586.785
>
>
> Other hosts not currently experiencing the high CPU load have a heap less
> than .75 full.
>
> $> jstat -gcutil -t 6063 5000 4
> Timestamp     S0     S1      E      O      P    YGC     YGCT    FGC     FGCT      GCT
>  520731.6   0.00  12.70  36.37  71.33  59.26  46453 1688.809  14785 2130.779 3819.588
>  520736.5   0.00  12.70  53.25  71.33  59.26  46453 1688.809  14785 2130.779 3819.588
>  520741.5   0.00  12.70  68.92  71.33  59.26  46453 1688.809  14785 2130.779 3819.588
>  520746.5   0.00  12.70  83.11  71.33  59.26  46453 1688.809  14785 2130.779 3819.588

--047d7b62461e4fc1e004cc6d4200--
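
The jstat samples quoted in the message can be turned into a full-collection rate and an old-generation occupancy check with a few lines of Python. This is only a rough sketch: the column order is copied from the jstat header in the message, the 75% threshold is the CMSInitiatingOccupancyFraction value from the quoted settings, and the function names (parse, summarize) are made up for illustration rather than taken from any Cassandra or JDK tool.

```python
# Column order copied from the jstat header quoted in the message.
COLUMNS = ["Timestamp", "S0", "S1", "E", "O", "P",
           "YGC", "YGCT", "FGC", "FGCT", "GCT"]

# The four samples from the host stuck in constant CMS collections.
SAMPLES = """\
Timestamp S0 S1 E O P YGC YGCT FGC FGCT GCT
132063.0 34.70 0.00 26.03 82.29 59.88 21580 506.887 17523 3078.941 3585.829
132068.0 34.70 0.00 50.02 81.23 59.88 21580 506.887 17524 3079.220 3586.107
132073.1 0.00 24.92 46.87 81.41 59.88 21581 506.932 17525 3079.583 3586.515
132078.1 0.00 24.92 64.71 81.40 59.88 21581 506.932 17527 3079.853 3586.785
"""


def parse(text):
    """Turn `jstat -gcutil -t` output into a list of {column: float} rows."""
    rows = []
    for line in text.splitlines():
        # Tolerate mail-quote prefixes ("> ") in pasted output.
        fields = line.replace(">", " ").split()
        if len(fields) != len(COLUMNS):
            continue
        try:
            rows.append(dict(zip(COLUMNS, (float(f) for f in fields))))
        except ValueError:
            continue  # header or other non-numeric line
    return rows


def summarize(rows, threshold_pct=75.0):
    """Compare the first and last sample; threshold_pct is an assumed default
    matching CMSInitiatingOccupancyFraction=75 from the quoted settings."""
    if len(rows) < 2:
        raise ValueError("need at least two samples")
    first, last = rows[0], rows[-1]
    elapsed = last["Timestamp"] - first["Timestamp"]  # seconds of JVM uptime
    return {
        "elapsed_seconds": elapsed,
        "full_gcs_per_minute": (last["FGC"] - first["FGC"]) / elapsed * 60.0,
        "full_gc_time_fraction": (last["FGCT"] - first["FGCT"]) / elapsed,
        "min_old_pct": min(r["O"] for r in rows),
        "always_over_threshold": all(r["O"] > threshold_pct for r in rows),
    }


if __name__ == "__main__":
    for key, value in summarize(parse(SAMPLES)).items():
        print(key, value)
```

For the four samples from the busy host, the FGC counter moves from 17523 to 17527 in about 15 seconds, which is roughly 16 per minute over that short window, and old-space occupancy never drops below 81%, so every sample is above the 75% threshold. A 15-second window is too short to characterize the node; a longer jstat run would give a steadier rate.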