Subject: Re: cassandra performance degrades after 12 hours
From: Chris Goffinet
To: user@cassandra.apache.org
Date: Mon, 3 Oct 2011 13:24:00 -0700
Yes, look at cassandra.yaml; there is a section about throttling compaction. You still *want* multi-threaded compaction. Throttling will occur across all threads. The reason is that you don't want to get stuck compacting bigger files while the smaller ones build up waiting for the bigger compaction to finish. That will slowly degrade read performance.

On Mon, Oct 3, 2011 at 1:19 PM, Ramesh Natarajan wrote:
> Thanks for the pointers. I checked the system and iostat showed that we
> are saturating the disk at 100%. The disk is a SCSI device exposed by
> ESXi, running on a dedicated LUN as RAID10 (4 x 600GB 15k drives)
> connected to the ESX host via iSCSI.
>
> When I run compactionstats I see we are compacting a column family which
> has about 10GB of data. During this time I also see dropped messages in
> the system.log file.
>
> Since my I/O rates are constant in my tests, I think compaction is
> throwing things off. Is there a way I can throttle compaction in
> cassandra? Rather than running multiple compactions at the same time, I
> would like to throttle it by I/O rate. Is that possible?
>
> If instead of having 5 big column families I create, say, 1000 each
> (5000 total), do you think that will help in this case? (smaller files
> and so a smaller load on compaction)
>
> Is it normal to have 5000 column families?
>
> thanks
> Ramesh
>
>
> On Mon, Oct 3, 2011 at 2:50 PM, Chris Goffinet wrote:
>> Most likely what is happening is that you are running single-threaded
>> compaction. Look at cassandra.yaml for how to enable multi-threaded
>> compaction. As more data comes into the system, bigger files get
>> created during compaction. You could be in a situation where you are
>> compacting at a higher bucket level N while compactions build up at
>> lower buckets.
>>
>> Run "nodetool -host localhost compactionstats" to get an idea of
>> what's going on.
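For reference, the settings Chris is pointing at in the 0.8-era cassandra.yaml look roughly like this (names and defaults quoted from memory, not from the thread; verify against the yaml shipped with your version):

```yaml
# Number of simultaneous compactions allowed. Keep it above 1 so that
# small compactions are not starved while a large one runs.
concurrent_compactors: 4

# Aggregate compaction I/O throttle in MB/s, shared across all
# compaction threads; 0 disables throttling.
compaction_throughput_mb_per_sec: 16
```

The throttle applies across all compactors, so multiple small compactions can still make progress alongside a big one.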
>>
>>
>> On Mon, Oct 3, 2011 at 12:05 PM, Mohit Anchlia wrote:
>>
>>> In order to understand what's going on, you might want to first do
>>> just the write test, look at the results, then do just the read
>>> tests, and then do combined read/write tests.
>>>
>>> Since you mentioned high update/delete rates, I should also ask: what
>>> CL do you use for writes/reads? With high updates/deletes plus a high
>>> CL, I think one should expect reads to slow down when sstables have
>>> not been compacted.
>>>
>>> You have 20G of memory and 17G is used by your process, and I also
>>> see 36G VIRT, which I don't really understand being that high when
>>> swap is disabled. Look at sar -r output too to make sure no swapping
>>> is occurring. Also, verify that jna.jar is installed.
>>>
>>> On Mon, Oct 3, 2011 at 11:52 AM, Ramesh Natarajan wrote:
>>> > I will start another test run to collect these stats. Our test
>>> > model is in the neighborhood of 4500 inserts, 8000 updates &
>>> > deletes, and 1500 reads every second across 6 servers.
>>> > Can you elaborate on reducing the heap space? Do you think the 17G
>>> > RSS is a problem?
>>> > thanks
>>> > Ramesh
>>> >
>>> >
>>> > On Mon, Oct 3, 2011 at 1:33 PM, Mohit Anchlia wrote:
>>> >>
>>> >> I am wondering if you are seeing issues because of more frequent
>>> >> compactions kicking in. Is this primarily write ops, or reads too?
>>> >> During the test period, gather data like:
>>> >>
>>> >> 1. cfstats
>>> >> 2. tpstats
>>> >> 3. compactionstats
>>> >> 4. netstats
>>> >> 5. iostat
>>> >>
>>> >> You have RSS memory close to 17GB. Maybe someone can give further
>>> >> advice on whether that could be because of mmap. You might want to
>>> >> lower your heap size to 6-8G and see if that helps.
>>> >>
>>> >> Also, check that you have jna.jar deployed and that you see the
>>> >> malloc successful message in the logs.
>>> >>
>>> >> On Mon, Oct 3, 2011 at 10:36 AM, Ramesh Natarajan
>>> >> wrote:
>>> >> > We have 5 CFs.
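A throwaway collector for the nodetool diagnostics in the list above might look like the following sketch (the `collect_stats` helper is mine, not from the thread; it assumes `nodetool` is on PATH, and leaves iostat/sar to a cron job or wrapper script):

```python
# Sketch: grab each nodetool diagnostic listed above in one shot.
# Run it periodically during the load test and diff the snapshots.
import subprocess

NODETOOL_CMDS = ["cfstats", "tpstats", "compactionstats", "netstats"]

def collect_stats(host="localhost", nodetool="nodetool"):
    """Run each nodetool subcommand once and return {subcommand: stdout}."""
    out = {}
    for cmd in NODETOOL_CMDS:
        result = subprocess.run([nodetool, "-host", host, cmd],
                                capture_output=True, text=True)
        out[cmd] = result.stdout
    return out
```

Snapshots taken every minute or so make it easy to see whether pending compactions or dropped messages start climbing right when throughput falls off.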
>>> >> > Attached is the output from the describe command. We don't have
>>> >> > row cache enabled.
>>> >> > Thanks
>>> >> > Ramesh
>>> >> > Keyspace: MSA:
>>> >> >   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>>> >> >   Durable Writes: true
>>> >> >     Options: [replication_factor:3]
>>> >> >   Column Families:
>>> >> >     ColumnFamily: admin
>>> >> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Row cache size / save period in seconds: 0.0/0
>>> >> >       Key cache size / save period in seconds: 200000.0/14400
>>> >> >       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
>>> >> >       GC grace seconds: 3600
>>> >> >       Compaction min/max thresholds: 4/32
>>> >> >       Read repair chance: 1.0
>>> >> >       Replicate on write: true
>>> >> >       Built indexes: []
>>> >> >     ColumnFamily: modseq
>>> >> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Row cache size / save period in seconds: 0.0/0
>>> >> >       Key cache size / save period in seconds: 500000.0/14400
>>> >> >       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
>>> >> >       GC grace seconds: 3600
>>> >> >       Compaction min/max thresholds: 4/32
>>> >> >       Read repair chance: 1.0
>>> >> >       Replicate on write: true
>>> >> >       Built indexes: []
>>> >> >     ColumnFamily: msgid
>>> >> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Row cache size / save period in seconds: 0.0/0
>>> >> >       Key cache size / save period in seconds: 500000.0/14400
>>> >> >       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
>>> >> >       GC grace seconds: 864000
>>> >> >       Compaction min/max thresholds: 4/32
>>> >> >       Read repair chance: 1.0
>>> >> >       Replicate on write: true
>>> >> >       Built indexes: []
>>> >> >     ColumnFamily: participants
>>> >> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Row cache size / save period in seconds: 0.0/0
>>> >> >       Key cache size / save period in seconds: 500000.0/14400
>>> >> >       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
>>> >> >       GC grace seconds: 3600
>>> >> >       Compaction min/max thresholds: 4/32
>>> >> >       Read repair chance: 1.0
>>> >> >       Replicate on write: true
>>> >> >       Built indexes: []
>>> >> >     ColumnFamily: uid
>>> >> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>>> >> >       Row cache size / save period in seconds: 0.0/0
>>> >> >       Key cache size / save period in seconds: 2000000.0/14400
>>> >> >       Memtable thresholds: 0.4/1440/121 (millions of ops/minutes/MB)
>>> >> >       GC grace seconds: 3600
>>> >> >       Compaction min/max thresholds: 4/32
>>> >> >       Read repair chance: 1.0
>>> >> >       Replicate on write: true
>>> >> >       Built indexes: []
>>> >> >
>>> >> >
>>> >> > On Mon, Oct 3, 2011 at 12:26 PM, Mohit Anchlia <mohitanchlia@gmail.com>
>>> >> > wrote:
>>> >> >>
>>> >> >> On Mon, Oct 3, 2011 at 10:12 AM, Ramesh Natarajan <ramesh25@gmail.com>
>>> >> >> wrote:
>>> >> >> > I am running a cassandra cluster of 6 nodes running RHEL6
>>> >> >> > virtualized by ESXi 5.0. Each VM is configured with 20GB of
>>> >> >> > RAM and 12 cores.
>>> >> >> > Our test setup performs about 3000 inserts per second. The
>>> >> >> > cassandra data partition is on an XFS filesystem mounted with
>>> >> >> > options (noatime,nodiratime,nobarrier,logbufs=8). We have no
>>> >> >> > swap enabled on the VMs and vm.swappiness is set to 0. To
>>> >> >> > avoid any contention issues, our cassandra VMs are not running
>>> >> >> > any application other than cassandra.
>>> >> >> > The test runs fine for about 12 hours or so. After that the
>>> >> >> > performance starts to degrade to about 1500 inserts per sec.
>>> >> >> > By 18-20 hours the inserts go down to 300 per sec.
>>> >> >> > If I do a truncate, it starts clean and runs for a few hours
>>> >> >> > (though not as clean as rebooting).
>>> >> >> > We find a direct correlation between kswapd kicking in after
>>> >> >> > 12 hours or so and the performance degradation. If I look at
>>> >> >> > the cached memory it is close to 10G. I am not getting an OOM
>>> >> >> > error in cassandra, so it looks like we are not running out of
>>> >> >> > memory. Can someone explain whether we can optimize this so
>>> >> >> > that kswapd doesn't kick in?
>>> >> >> >
>>> >> >> > Our top output shows
>>> >> >> > top - 16:23:54 up 2 days, 23:17,  4 users,  load average: 2.21, 2.08, 2.02
>>> >> >> > Tasks: 213 total,   1 running, 212 sleeping,   0 stopped,   0 zombie
>>> >> >> > Cpu(s):  1.6%us,  0.8%sy,  0.0%ni, 90.9%id,  6.3%wa,  0.0%hi,  0.2%si,  0.0%st
>>> >> >> > Mem:  20602812k total, 20320424k used,   282388k free,     1020k buffers
>>> >> >> > Swap:        0k total,        0k used,        0k free, 10145516k cached
>>> >> >> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>> >> >> >  2586 root      20   0 36.3g  17g 8.4g S 32.1 88.9   8496:37 java
>>> >> >> >
>>> >> >> > java output
>>> >> >> > root      2453     1 99 Sep30 pts/0    9-13:51:38 java -ea
>>> >> >> > -javaagent:./apache-cassandra-0.8.6/bin/../lib/jamm-0.2.2.jar
>>> >> >> > -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42
>>> >> >> > -Xms10059M -Xmx10059M
>>> >> >> > -Xmn1200M -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC
>>> >> >> > -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>>> >> >> > -XX:SurvivorRatio=8
>>> >> >> > -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75
>>> >> >> > -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true
>>> >> >> > -Dcom.sun.management.jmxremote.port=7199
>>> >> >> > -Dcom.sun.management.jmxremote.ssl=false
>>> >> >> > -Dcom.sun.management.jmxremote.authenticate=false
>>> >> >> > -Djava.rmi.server.hostname=10.19.104.14
>>> >> >> > -Djava.net.preferIPv4Stack=true
>>> >> >> > -Dlog4j.configuration=log4j-server.properties
>>> >> >> > -Dlog4j.defaultInitOverride=true -cp
>>> >> >> > ./apache-cassandra-0.8.6/bin/../conf:./apache-cassandra-0.8.6/bin/../build/classes/main:./apache-cassandra-0.8.6/bin/../build/classes/thrift:./apache-cassandra-0.8.6/bin/../lib/antlr-3.2.jar:./apache-cassandra-0.8.6/bin/../lib/apache-cassandra-0.8.6.jar:./apache-cassandra-0.8.6/bin/../lib/apache-cassandra-thrift-0.8.6.jar:./apache-cassandra-0.8.6/bin/../lib/avro-1.4.0-fixes.jar:./apache-cassandra-0.8.6/bin/../lib/avro-1.4.0-sources-fixes.jar:./apache-cassandra-0.8.6/bin/../lib/commons-cli-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/commons-codec-1.2.jar:./apache-cassandra-0.8.6/bin/../lib/commons-collections-3.2.1.jar:./apache-cassandra-0.8.6/bin/../lib/commons-lang-2.4.jar:./apache-cassandra-0.8.6/bin/../lib/concurrentlinkedhashmap-lru-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/guava-r08.jar:./apache-cassandra-0.8.6/bin/../lib/high-scale-lib-1.1.2.jar:./apache-cassandra-0.8.6/bin/../lib/jackson-core-asl-1.4.0.jar:./apache-cassandra-0.8.6/bin/../lib/jackson-mapper-asl-1.4.0.jar:./apache-cassandra-0.8.6/bin/../lib/jamm-0.2.2.jar:./apache-cassandra-0.8.6/bin/../lib/jline-0.9.94.jar:./apache-cassandra-0.8.6/bin/../lib/json-simple-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/libthrift-0.6.jar:./apache-cassandra-0.8.6/bin/../lib/log4j-1.2.16.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-examples.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-impl.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-jmx.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-remote.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-rimpl.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-rjmx.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-tools.jar:./apache-cassandra-0.8.6/bin/../lib/servlet-api-2.5-20081211.jar:./apache-cassandra-0.8.6/bin/../lib/slf4j-api-1.6.1.jar:./apache-cassandra-0.8.6/bin/../lib/slf4j-log4j12-1.6.1.jar:./apache-cassandra-0.8.6/bin/../lib/snakeyaml-1.6.jar
>>> >> >> > org.apache.cassandra.thrift.CassandraDaemon
>>> >> >> >
>>> >> >> > Ring output
>>> >> >> > [root@CAP4-CNode4 apache-cassandra-0.8.6]# ./bin/nodetool -h 127.0.0.1 ring
>>> >> >> > Address       DC          Rack   Status State   Load      Owns    Token
>>> >> >> >                                                                  141784319550391026443072753096570088105
>>> >> >> > 10.19.104.11  datacenter1 rack1  Up     Normal  19.92 GB  16.67%  0
>>> >> >> > 10.19.104.12  datacenter1 rack1  Up     Normal  19.3 GB   16.67%  28356863910078205288614550619314017621
>>> >> >> > 10.19.104.13  datacenter1 rack1  Up     Normal  18.57 GB  16.67%  56713727820156410577229101238628035242
>>> >> >> > 10.19.104.14  datacenter1 rack1  Up     Normal  19.34 GB  16.67%  85070591730234615865843651857942052863
>>> >> >> > 10.19.105.11  datacenter1 rack1  Up     Normal  19.88 GB  16.67%  113427455640312821154458202477256070484
>>> >> >> > 10.19.105.12  datacenter1 rack1  Up     Normal  20 GB     16.67%  141784319550391026443072753096570088105
>>> >> >> > [root@CAP4-CNode4 apache-cassandra-0.8.6]#
>>> >> >>
>>> >> >> How many CFs? Can you describe the CFs and post the
>>> >> >> configuration? Do you have row cache enabled?
>>> >> >
>>> >> >
>>> >
>>> >
>>
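As a sanity check on the ring output above: the six tokens are exactly the evenly spaced initial tokens for RandomPartitioner's 2^127 token space, so the ring is perfectly balanced. They can be recomputed like this (the `balanced_tokens` helper name is mine):

```python
# Recompute the evenly spaced initial tokens shown in the ring output.
# RandomPartitioner's token space is [0, 2**127); a balanced N-node ring
# assigns node i the token i * (2**127 // N).
def balanced_tokens(n_nodes):
    step = 2 ** 127 // n_nodes
    return [i * step for i in range(n_nodes)]

tokens = balanced_tokens(6)
# tokens[5] == 141784319550391026443072753096570088105, the token listed
# for 10.19.105.12 in the ring output above.
```

Since ownership is even, the degradation is unlikely to be a hot-spot on one node; the per-node compaction behavior is the more plausible culprit.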