incubator-cassandra-user mailing list archives

From Ramesh Natarajan <rames...@gmail.com>
Subject Re: cassandra performance degrades after 12 hours
Date Mon, 03 Oct 2011 20:19:25 GMT
Thanks for the pointers.  I checked the system, and iostat shows we are
saturating the disk at 100% utilization. The disk is a SCSI device exposed by
ESXi, running on a dedicated LUN as RAID10 (4 x 600GB 15k drives) connected
to the ESX host via iSCSI.
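
For anyone reproducing this check, the %util column of `iostat -x` is the
saturation signal. A minimal sketch, parsing a captured sample (the device
name and numbers below are hypothetical, not from my hosts):

```shell
# Hypothetical captured `iostat -x` sample; %util is the last column.
iostat_sample='Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 12.40 85.20 310.50 3400.00 12400.00 39.90 8.20 20.70 2.50 99.80'

# Flag any device at or above 95% utilization (skip the header row).
saturated=$(echo "$iostat_sample" | awk 'NR > 1 && $NF + 0 >= 95 { print $1 }')
echo "saturated devices: $saturated"
```

Watching this live (`iostat -x 5`) during compaction is what showed the 100%
figure above.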

When I run compactionstats I see we are compacting a column family that has
about 10GB of data. During this time I also see dropped messages in the
system.log file.

Since my IO rates are constant in my tests, I think compaction is throwing
things off.  Is there a way I can throttle compaction in Cassandra? Rather
than running multiple compactions at the same time, I would like to throttle
it by IO rate. Is that possible?
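
(Answering my own question after reading the docs: 0.8's cassandra.yaml does
expose a global IO throttle for compaction. A sketch of the setting; 16 MB/s
is just an example value, not a recommendation:)

```yaml
# cassandra.yaml -- caps total compaction throughput across the node.
# 16 is an example figure; 0 disables the throttle entirely.
compaction_throughput_mb_per_sec: 16
```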

If, instead of having 5 big column families, I create say 1000 of each (5000
total), do you think it will help in this case? (Smaller files, and so a
smaller load on compaction.)

Is it normal to have 5000 column families?
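
My own back-of-envelope worry about that idea: each column family carries its
own memtable, so per-CF overhead multiplies. A rough sketch, assuming an
average of 16 MB resident per active memtable (an assumed figure; my
thresholds above allow up to 121 MB each):

```shell
# Rough memtable headroom estimate for a 5000-CF schema.
cf_count=5000
mb_per_memtable=16   # assumed average resident size per memtable (hypothetical)
total_mb=$((cf_count * mb_per_memtable))
echo "memtable headroom needed: ${total_mb} MB (~$((total_mb / 1024)) GB)"
```

Even at that conservative per-memtable figure the total dwarfs my 10G heap,
so splitting into thousands of CFs seems to trade compaction load for memory
pressure.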

thanks
Ramesh



On Mon, Oct 3, 2011 at 2:50 PM, Chris Goffinet <cg@chrisgoffinet.com> wrote:

> Most likely what is happening is that you are running single-threaded
> compaction. Look at cassandra.yaml for how to enable multi-threaded
> compaction. As more data comes into the system, bigger files get created
> during compaction. You could be in a situation where you are compacting
> at a higher bucket N level while compactions build up at lower buckets.
>
> Run "nodetool -host localhost compactionstats" to get an idea of what's
> going on.
>
>
> On Mon, Oct 3, 2011 at 12:05 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
>
>> In order to understand what's going on you might want to first just do
>> write test, look at the results and then do just the read tests and
>> then do both read / write tests.
>>
>> Since you mentioned high updates/deletes, I should also ask: what CL are
>> you using for writes/reads? With high updates/deletes and a high CL, I
>> think one should expect reads to slow down when sstables have not been
>> compacted.
>>
>> You have 20G of memory and 17G is used by your process. I also see 36G
>> VIRT, which I don't really understand given that swap is disabled. Look
>> at sar -r output too to make sure no swapping is occurring. Also, verify
>> that jna.jar is installed.
>>
>> On Mon, Oct 3, 2011 at 11:52 AM, Ramesh Natarajan <ramesh25@gmail.com>
>> wrote:
>> > I will start another test run to collect these stats. Our test model is
>> > in the neighborhood of 4500 inserts, 8000 updates & deletes, and 1500
>> > reads every second across 6 servers.
>> > Can you elaborate more on reducing the heap space? Do you think it is a
>> > problem with 17G RSS?
>> > thanks
>> > Ramesh
>> >
>> >
>> > On Mon, Oct 3, 2011 at 1:33 PM, Mohit Anchlia <mohitanchlia@gmail.com>
>> > wrote:
>> >>
>> >> I am wondering if you are seeing issues because of more frequent
>> >> compactions kicking in. Is this primarily write ops or reads too?
>> >> During the period of test gather data like:
>> >>
>> >> 1. cfstats
>> >> 2. tpstats
>> >> 3. compactionstats
>> >> 4. netstats
>> >> 5. iostat
>> >>
>> >> You have RSS memory close to 17gb. Maybe someone can give further
>> >> advice on whether that could be because of mmap. You might want to
>> >> lower your heap size to 6-8G and see if that helps.
>> >>
>> >> Also, check that you have jna.jar deployed and that you see the
>> >> mlockall successful message in the logs.
>> >>
>> >> On Mon, Oct 3, 2011 at 10:36 AM, Ramesh Natarajan <ramesh25@gmail.com>
>> >> wrote:
>> >> > We have 5 CF.  Attached is the output from the describe command.  We
>> >> > don't
>> >> > have row cache enabled.
>> >> > Thanks
>> >> > Ramesh
>> >> > Keyspace: MSA:
>> >> >   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>> >> >   Durable Writes: true
>> >> >     Options: [replication_factor:3]
>> >> >   Column Families:
>> >> >     ColumnFamily: admin
>> >> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Row cache size / save period in seconds: 0.0/0
>> >> >       Key cache size / save period in seconds: 200000.0/14400
>> >> >       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
>> >> >       GC grace seconds: 3600
>> >> >       Compaction min/max thresholds: 4/32
>> >> >       Read repair chance: 1.0
>> >> >       Replicate on write: true
>> >> >       Built indexes: []
>> >> >     ColumnFamily: modseq
>> >> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Row cache size / save period in seconds: 0.0/0
>> >> >       Key cache size / save period in seconds: 500000.0/14400
>> >> >       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
>> >> >       GC grace seconds: 3600
>> >> >       Compaction min/max thresholds: 4/32
>> >> >       Read repair chance: 1.0
>> >> >       Replicate on write: true
>> >> >       Built indexes: []
>> >> >     ColumnFamily: msgid
>> >> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Row cache size / save period in seconds: 0.0/0
>> >> >       Key cache size / save period in seconds: 500000.0/14400
>> >> >       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
>> >> >       GC grace seconds: 864000
>> >> >       Compaction min/max thresholds: 4/32
>> >> >       Read repair chance: 1.0
>> >> >       Replicate on write: true
>> >> >       Built indexes: []
>> >> >     ColumnFamily: participants
>> >> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Row cache size / save period in seconds: 0.0/0
>> >> >       Key cache size / save period in seconds: 500000.0/14400
>> >> >       Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
>> >> >       GC grace seconds: 3600
>> >> >       Compaction min/max thresholds: 4/32
>> >> >       Read repair chance: 1.0
>> >> >       Replicate on write: true
>> >> >       Built indexes: []
>> >> >     ColumnFamily: uid
>> >> >       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>> >> >       Row cache size / save period in seconds: 0.0/0
>> >> >       Key cache size / save period in seconds: 2000000.0/14400
>> >> >       Memtable thresholds: 0.4/1440/121 (millions of ops/minutes/MB)
>> >> >       GC grace seconds: 3600
>> >> >       Compaction min/max thresholds: 4/32
>> >> >       Read repair chance: 1.0
>> >> >       Replicate on write: true
>> >> >       Built indexes: []
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > On Mon, Oct 3, 2011 at 12:26 PM, Mohit Anchlia <mohitanchlia@gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> On Mon, Oct 3, 2011 at 10:12 AM, Ramesh Natarajan <ramesh25@gmail.com>
>> >> >> wrote:
>> >> >> > I am running a cassandra cluster of 6 nodes running RHEL6
>> >> >> > virtualized by ESXi 5.0.  Each VM is configured with 20GB of RAM
>> >> >> > and 12 cores.  Our test setup performs about 3000 inserts per
>> >> >> > second.  The cassandra data partition is on an XFS filesystem
>> >> >> > mounted with options (noatime,nodiratime,nobarrier,logbufs=8).
>> >> >> > We have no swap enabled on the VMs and vm.swappiness is set to 0.
>> >> >> > To avoid any contention issues, our cassandra VMs are not running
>> >> >> > any application other than cassandra.
>> >> >> > The test runs fine for about 12 hours or so. After that the
>> >> >> > performance starts to degrade to about 1500 inserts per sec. By
>> >> >> > 18-20 hours the inserts go down to 300 per sec.
>> >> >> > If I do a truncate, it starts clean and runs for a few hours
>> >> >> > (though not as clean as rebooting).
>> >> >> > We find a direct correlation between kswapd kicking in after 12
>> >> >> > hours or so and the performance degradation.  If I look at the
>> >> >> > cached memory, it is close to 10G.  I am not getting an OOM error
>> >> >> > in cassandra, so it looks like we are not running out of memory.
>> >> >> > Can someone explain how we can optimize this so that kswapd
>> >> >> > doesn't kick in?
>> >> >> >
>> >> >> > Our top output shows:
>> >> >> > top - 16:23:54 up 2 days, 23:17,  4 users,  load average: 2.21, 2.08, 2.02
>> >> >> > Tasks: 213 total,   1 running, 212 sleeping,   0 stopped,   0 zombie
>> >> >> > Cpu(s):  1.6%us,  0.8%sy,  0.0%ni, 90.9%id,  6.3%wa,  0.0%hi,  0.2%si,  0.0%st
>> >> >> > Mem:  20602812k total, 20320424k used,   282388k free,     1020k buffers
>> >> >> > Swap:        0k total,        0k used,        0k free, 10145516k cached
>> >> >> >
>> >> >> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> >> >> >  2586 root      20   0 36.3g  17g 8.4g S 32.1 88.9   8496:37 java
>> >> >> >
>> >> >> > java output
>> >> >> > root      2453     1 99 Sep30 pts/0    9-13:51:38 java -ea
>> >> >> > -javaagent:./apache-cassandra-0.8.6/bin/../lib/jamm-0.2.2.jar
>> >> >> > -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms10059M
>> >> >> > -Xmx10059M -Xmn1200M -XX:+HeapDumpOnOutOfMemoryError -Xss128k
>> >> >> > -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>> >> >> > -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
>> >> >> > -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
>> >> >> > -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199
>> >> >> > -Dcom.sun.management.jmxremote.ssl=false
>> >> >> > -Dcom.sun.management.jmxremote.authenticate=false
>> >> >> > -Djava.rmi.server.hostname=10.19.104.14 -Djava.net.preferIPv4Stack=true
>> >> >> > -Dlog4j.configuration=log4j-server.properties
>> >> >> > -Dlog4j.defaultInitOverride=true -cp
>> >> >> >
>> >> >> >
>> >> >> >
>> ./apache-cassandra-0.8.6/bin/../conf:./apache-cassandra-0.8.6/bin/../build/classes/main:./apache-cassandra-0.8.6/bin/../build/classes/thrift:./apache-cassandra-0.8.6/bin/../lib/antlr-3.2.jar:./apache-cassandra-0.8.6/bin/../lib/apache-cassandra-0.8.6.jar:./apache-cassandra-0.8.6/bin/../lib/apache-cassandra-thrift-0.8.6.jar:./apache-cassandra-0.8.6/bin/../lib/avro-1.4.0-fixes.jar:./apache-cassandra-0.8.6/bin/../lib/avro-1.4.0-sources-fixes.jar:./apache-cassandra-0.8.6/bin/../lib/commons-cli-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/commons-codec-1.2.jar:./apache-cassandra-0.8.6/bin/../lib/commons-collections-3.2.1.jar:./apache-cassandra-0.8.6/bin/../lib/commons-lang-2.4.jar:./apache-cassandra-0.8.6/bin/../lib/concurrentlinkedhashmap-lru-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/guava-r08.jar:./apache-cassandra-0.8.6/bin/../lib/high-scale-lib-1.1.2.jar:./apache-cassandra-0.8.6/bin/../lib/jackson-core-asl-1.4.0.jar:./apache-cassandra-0.8.6/bin/../lib/jackson-mapper-asl-1.4.0.jar:./apache-cassandra-0.8.6/bin/../lib/jamm-0.2.2.jar:./apache-cassandra-0.8.6/bin/../lib/jline-0.9.94.jar:./apache-cassandra-0.8.6/bin/../lib/json-simple-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/libthrift-0.6.jar:./apache-cassandra-0.8.6/bin/../lib/log4j-1.2.16.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-examples.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-impl.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-jmx.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-remote.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-rimpl.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-rjmx.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-tools.jar:./apache-cassandra-0.8.6/bin/../lib/servlet-api-2.5-20081211.jar:./apache-cassandra-0.8.6/bin/../lib/slf4j-api-1.6.1.jar:./apache-cassandra-0.8.6/bin/../lib/slf4j-log4j12-1.6.1.jar:./apache-cassandra-0.8.6/bin/../lib/snakeyaml-1.6.jar
>> >> >> > org.apache.cassandra.thrift.CassandraDaemon
>> >> >> >
>> >> >> >
>> >> >> > Ring output
>> >> >> > [root@CAP4-CNode4 apache-cassandra-0.8.6]# ./bin/nodetool -h 127.0.0.1 ring
>> >> >> > Address         DC          Rack        Status State   Load      Owns    Token
>> >> >> >                                                                          141784319550391026443072753096570088105
>> >> >> > 10.19.104.11    datacenter1 rack1       Up     Normal  19.92 GB  16.67%  0
>> >> >> > 10.19.104.12    datacenter1 rack1       Up     Normal  19.3 GB   16.67%  28356863910078205288614550619314017621
>> >> >> > 10.19.104.13    datacenter1 rack1       Up     Normal  18.57 GB  16.67%  56713727820156410577229101238628035242
>> >> >> > 10.19.104.14    datacenter1 rack1       Up     Normal  19.34 GB  16.67%  85070591730234615865843651857942052863
>> >> >> > 10.19.105.11    datacenter1 rack1       Up     Normal  19.88 GB  16.67%  113427455640312821154458202477256070484
>> >> >> > 10.19.105.12    datacenter1 rack1       Up     Normal  20 GB     16.67%  141784319550391026443072753096570088105
>> >> >> > [root@CAP4-CNode4 apache-cassandra-0.8.6]#
>> >> >>
>> >> >> How many CFs? Can you describe the CFs and post the configuration?
>> >> >> Do you have row cache enabled?
>> >> >
>> >> >
>> >
>> >
>>
>
>
