incubator-cassandra-user mailing list archives

From Ran Tavory <ran...@gmail.com>
Subject Re: performance tuning - where does the slowness come from?
Date Tue, 04 May 2010 19:52:10 GMT
It's a 64-bit host.
When I disable mmap I see less memory used and zero swapping, but memory
usage is slowly growing, so I'll have to wait and see.
Performance isn't much better; I'm not sure what the bottleneck is now (it
could also be the application).

Now on the same host I see:
top - 15:43:59 up 12 days,  4:23,  1 user,  load average: 0.29, 0.68, 1.53
Tasks: 152 total,   1 running, 151 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.1%us,  0.5%sy,  0.0%ni, 97.8%id,  0.3%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   8168376k total,  8120364k used,    48012k free,     2540k buffers
Swap:  4194296k total,    12816k used,  4181480k free,  5028672k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  SWAP nFLT COMMAND
25122 cassandr  22   0 4943m 2.9g   9m S 12.6 36.7  35:39.53 2.0g  141 java


$ vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0  12816  46656   2664 5021340    8    6    79    34    3    1  1  1 95  3  0
 0  0  12816  48180   2672 5019460    0    0   282     9 1913 2450  2  1 97  0  0
 0  0  12816  45064   2688 5020688    0    0   282    83 1850 2303  1  1 97  0  0
 0  0  12816  47612   2696 5017520    0    0   102    59 1884 2328  1  1 98  0  0
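
To make the "zero swapping" observation easier to check over a longer run, here is a minimal Python sketch that parses vmstat rows and reports the samples with non-zero swap-in (si) or swap-out (so). The helper names (parse_vmstat_row, swapping_rows) and the threshold parameter are made up for illustration, and the column order is assumed to match the vmstat header shown above.

```python
# Column names follow the vmstat header shown in this message.
VMSTAT_COLUMNS = ("r b swpd free buff cache si so bi bo in cs "
                  "us sy id wa st").split()


def parse_vmstat_row(line):
    """Turn one numeric vmstat row into a dict keyed by column name."""
    values = [int(tok) for tok in line.split()]
    if len(values) != len(VMSTAT_COLUMNS):
        raise ValueError("expected %d columns, got %d"
                         % (len(VMSTAT_COLUMNS), len(values)))
    return dict(zip(VMSTAT_COLUMNS, values))


def swapping_rows(lines, threshold=0):
    """Return the parsed rows whose si or so value exceeds threshold."""
    rows = [parse_vmstat_row(line) for line in lines if line.strip()]
    return [r for r in rows if r["si"] > threshold or r["so"] > threshold]


# Rows copied from the vmstat output in this message.
SAMPLE = """
 1  0  12816  46656   2664 5021340    8    6    79    34    3    1  1  1 95  3  0
 0  0  12816  48180   2672 5019460    0    0   282     9 1913 2450  2  1 97  0  0
 0  0  12816  45064   2688 5020688    0    0   282    83 1850 2303  1  1 97  0  0
 0  0  12816  47612   2696 5017520    0    0   102    59 1884 2328  1  1 98  0  0
""".splitlines()

if __name__ == "__main__":
    # vmstat's first report is an average since boot, so skip it
    # when judging current behaviour.
    live = [line for line in SAMPLE if line.strip()][1:]
    print(len(swapping_rows(live)), "of", len(live),
          "samples show swap activity")
```

The first vmstat row is dropped because it reports averages since the last reboot rather than the current interval, which is why it still shows small si/so values here.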


On Tue, May 4, 2010 at 10:27 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

> Are you using 32-bit hosts?  If not, don't be scared of mmap using a
> lot of address space; you have plenty.  It won't make you swap more
> than using buffered I/O.
>
> On Tue, May 4, 2010 at 1:57 PM, Ran Tavory <rantav@gmail.com> wrote:
> > I canceled mmap and indeed memory usage is sane again. So far performance
> > hasn't been great, but I'll wait and see.
> > I'm also interested in a way to cap mmap so I can take advantage of it
> > but not swap the host to death...
> >
> > On Tue, May 4, 2010 at 9:38 PM, Kyusik Chung <kyusik@discovereads.com>
> > wrote:
> >>
> >> This sounds just like the slowness I was asking about in another
> >> thread - after a lot of reads, the machine uses up all available
> >> memory on the box and then starts swapping.
> >> My understanding was that mmap helps greatly with read and write perf
> >> (until the box starts swapping, I guess)... is there any way to use
> >> mmap and cap how much memory it takes up?
> >> What do people use in production?  mmap or no mmap?
> >> Thanks!
> >> Kyusik Chung
> >> On May 4, 2010, at 10:11 AM, Schubert Zhang wrote:
> >>
> >> 1. When you first start up your nodes, plan the InitialToken of each
> >> node so that the tokens are evenly spaced.
> >> 2. <DiskAccessMode>standard</DiskAccessMode>
> >>
> >> On Tue, May 4, 2010 at 9:09 PM, Boris Shulman <shulmanb@gmail.com>
> >> wrote:
> >>>
> >>> I think that the extra (more than 4GB) memory usage comes from the
> >>> mmapped I/O; that is why it happens only for reads.
> >>>
> >>> On Tue, May 4, 2010 at 2:02 PM, Jordan Pittier
> >>> <jordan.pittier@gmail.com> wrote:
> >>> > I'm facing the same issue with swap. It only occurs when I perform
> >>> > read operations (writes are very fast :)). So I can't help you with
> >>> > the memory problem.
> >>> >
> >>> > But to balance the load evenly between the nodes in the cluster, just
> >>> > set their tokens manually (the "formula" is i * 2^127 / nb_nodes).
> >>> >
> >>> > Jordan
> >>> >
> >>> > On Tue, May 4, 2010 at 8:20 AM, Ran Tavory <rantav@gmail.com>
> >>> > wrote:
> >>> >>
> >>> >> I'm looking into performance issues on a 0.6.1 cluster. I see two
> >>> >> symptoms:
> >>> >> 1. Reads and writes are slow.
> >>> >> 2. One of the hosts is doing a lot of GC.
> >>> >> 1 is slow in the sense that in its normal state the cluster used to
> >>> >> do around 3-5k reads and 3-5k writes per second (6-10k operations
> >>> >> per second), but now it's on the order of 200-400 ops per second,
> >>> >> sometimes even less.
> >>> >> 2 looks like this:
> >>> >> $ tail -f /outbrain/cassandra/log/system.log
> >>> >>  INFO [GC inspection] 2010-05-04 00:42:18,636 GCInspector.java (line 110) GC for ParNew: 672 ms, 166482384 reclaimed leaving 2872087208 used; max is 4432068608
> >>> >>  INFO [GC inspection] 2010-05-04 00:42:28,638 GCInspector.java (line 110) GC for ParNew: 498 ms, 166493352 reclaimed leaving 2836049448 used; max is 4432068608
> >>> >>  INFO [GC inspection] 2010-05-04 00:42:38,640 GCInspector.java (line 110) GC for ParNew: 327 ms, 166091528 reclaimed leaving 2796888424 used; max is 4432068608
> >>> >> ... and it goes on and on for hours, no stopping...
> >>> >> The cluster is made of 6 hosts, 3 in one DC and 3 in another.
> >>> >> Each host has 8G RAM.
> >>> >> The JVM runs with -Xmx4G.
> >>> >> For some reason, the load isn't distributed evenly between the
> >>> >> hosts, although I'm not sure this is the cause of the slowness.
> >>> >> $ nodetool -h localhost -p 9004 ring
> >>> >> Address        Status  Load       Range                                     Ring
> >>> >>                                   144413773383729447702215082383444206680
> >>> >> 192.168.252.99 Up      15.94 GB   66002764663998929243644931915471302076    |<--|
> >>> >> 192.168.254.57 Up      19.84 GB   81288739225600737067856268063987022738    |   ^
> >>> >> 192.168.254.58 Up      973.78 MB  86999744104066390588161689990810839743    v   |
> >>> >> 192.168.252.62 Up      5.18 GB    88308919879653155454332084719458267849    |   ^
> >>> >> 192.168.254.59 Up      10.57 GB   142482163220375328195837946953175033937   v   |
> >>> >> 192.168.252.61 Up      11.36 GB   144413773383729447702215082383444206680   |-->|
> >>> >> The slow host is 192.168.252.61 and it isn't the most loaded one.
> >>> >> The host is waiting a lot on IO and the load average is usually 6-7.
> >>> >> $ w
> >>> >>  00:42:56 up 11 days, 13:22,  1 user,  load average: 6.21, 5.52, 3.93
> >>> >> $ vmstat 5
> >>> >> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
> >>> >>  r  b    swpd   free   buff   cache   si   so    bi    bo   in     cs us sy id wa st
> >>> >>  0  8 2147844  45744   1816 4457384    6    5    66    32    5      2  1  1 96  2  0
> >>> >>  0  8 2147164  49020   1808 4451596  385    0  2345    58 3372   9957  2  2 78 18  0
> >>> >>  0  3 2146432  45704   1812 4453956  342    0  2274   108 3937  10732  2  2 78 19  0
> >>> >>  0  1 2146252  44696   1804 4453436  345  164  1939   294 3647   7833  2  2 78 18  0
> >>> >>  0  1 2145960  46924   1744 4451260  158    0  2423   122 4354  14597  2  2 77 18  0
> >>> >>  7  1 2138344  44676    952 4504148 1722  403  1722   406 1388    439 87  0 10  2  0
> >>> >>  7  2 2137248  45652    956 4499436 1384  655  1384   658 1356    392 87  0 10  3  0
> >>> >>  7  1 2135976  46764    956 4495020 1366  718  1366   718 1395    380 87  0  9  4  0
> >>> >>  0  8 2134484  46964    956 4489420 1673  555  1814   586 1601 215590 14  2 68 16  0
> >>> >>  0  1 2135388  47444    972 4488516  785  833  2390   995 3812   8305  2  2 77 20  0
> >>> >>  0 10 2135164  45928    980 4488796  788  543  2275   626 36
> >>> >> So, the host is swapping like crazy...
> >>> >> top shows that it's using a lot of memory. As noted before, -Xmx is 4G
> >>> >> and nothing else seems to be using a lot of memory on the host except
> >>> >> for the cassandra process; however, of the 8G RAM on the host, 92% is
> >>> >> used by cassandra. How's that?
> >>> >> top shows there's 3.9g Shared, 7.2g Resident and 15.9g Virtual. Why
> >>> >> does it have 15g virtual? And why 7.2g RES? This could explain the
> >>> >> swapping and the slowness.
> >>> >> $ top
> >>> >>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> >>> >> 20281 cassandr  25   0 15.9g 7.2g 3.9g S 33.3 92.6 175:30.27 java
> >>> >> So, can the total memory be controlled?
> >>> >> Or perhaps I'm looking in the wrong direction...
> >>> >> I've looked at all the cassandra JMX counts and nothing seemed
> >>> >> suspicious so far. By suspicious I mean a large number of pending
> >>> >> tasks - there were always very small numbers in each pool.
> >>> >> About read and write latencies, I'm not sure what the normal state
> >>> >> is, but here's an example of what I see on the problematic host:
> >>> >> #mbean = org.apache.cassandra.service:type=StorageProxy:
> >>> >> RecentReadLatencyMicros = 30105.888180684495;
> >>> >> TotalReadLatencyMicros = 78543052801;
> >>> >> TotalWriteLatencyMicros = 4213118609;
> >>> >> RecentWriteLatencyMicros = 1444.4809201925639;
> >>> >> ReadOperations = 4779553;
> >>> >> RangeOperations = 0;
> >>> >> TotalRangeLatencyMicros = 0;
> >>> >> RecentRangeLatencyMicros = NaN;
> >>> >> WriteOperations = 4740093;
> >>> >> And the only pool that I do see some pending tasks is the
> >>> >> ROW-READ-STAGE,
> >>> >> but it doesn't look like much, usually around 6-8:
> >>> >> #mbean = org.apache.cassandra.concurrent:type=ROW-READ-STAGE:
> >>> >> ActiveCount = 8;
> >>> >> PendingTasks = 8;
> >>> >> CompletedTasks = 5427955;
> >>> >> Any help finding the solution is appreciated, thanks...
> >>> >> Below are a few more JMXes I collected from the system that may be
> >>> >> interesting.
> >>> >> #mbean = java.lang:type=Memory:
> >>> >> Verbose = false;
> >>> >> HeapMemoryUsage = {
> >>> >>   committed = 3767279616;
> >>> >>   init = 134217728;
> >>> >>   max = 4293656576;
> >>> >>   used = 1237105080;
> >>> >>  };
> >>> >> NonHeapMemoryUsage = {
> >>> >>   committed = 35061760;
> >>> >>   init = 24313856;
> >>> >>   max = 138412032;
> >>> >>   used = 23151320;
> >>> >>  };
> >>> >> ObjectPendingFinalizationCount = 0;
> >>> >> #mbean = java.lang:name=ParNew,type=GarbageCollector:
> >>> >> LastGcInfo = {
> >>> >>   GcThreadCount = 11;
> >>> >>   duration = 136;
> >>> >>   endTime = 42219272;
> >>> >>   id = 11719;
> >>> >>   memoryUsageAfterGc = {
> >>> >>     ( CMS Perm Gen ) = {
> >>> >>       key = CMS Perm Gen;
> >>> >>       value = {
> >>> >>         committed = 29229056;
> >>> >>         init = 21757952;
> >>> >>         max = 88080384;
> >>> >>         used = 17648848;
> >>> >>        };
> >>> >>      };
> >>> >>     ( Code Cache ) = {
> >>> >>       key = Code Cache;
> >>> >>       value = {
> >>> >>         committed = 5832704;
> >>> >>         init = 2555904;
> >>> >>         max = 50331648;
> >>> >>         used = 5563520;
> >>> >>        };
> >>> >>      };
> >>> >>     ( CMS Old Gen ) = {
> >>> >>       key = CMS Old Gen;
> >>> >>       value = {
> >>> >>         committed = 3594133504;
> >>> >>         init = 112459776;
> >>> >>         max = 4120510464;
> >>> >>         used = 964565720;
> >>> >>        };
> >>> >>      };
> >>> >>     ( Par Eden Space ) = {
> >>> >>       key = Par Eden Space;
> >>> >>       value = {
> >>> >>         committed = 171835392;
> >>> >>         init = 21495808;
> >>> >>         max = 171835392;
> >>> >>         used = 0;
> >>> >>        };
> >>> >>      };
> >>> >>     ( Par Survivor Space ) = {
> >>> >>       key = Par Survivor Space;
> >>> >>       value = {
> >>> >>         committed = 1310720;
> >>> >>         init = 131072;
> >>> >>         max = 1310720;
> >>> >>         used = 0;
> >>> >>        };
> >>> >>      };
> >>> >>    };
> >>> >>   memoryUsageBeforeGc = {
> >>> >>     ( CMS Perm Gen ) = {
> >>> >>       key = CMS Perm Gen;
> >>> >>       value = {
> >>> >>         committed = 29229056;
> >>> >>         init = 21757952;
> >>> >>         max = 88080384;
> >>> >>         used = 17648848;
> >>> >>        };
> >>> >>      };
> >>> >>     ( Code Cache ) = {
> >>> >>       key = Code Cache;
> >>> >>       value = {
> >>> >>         committed = 5832704;
> >>> >>         init = 2555904;
> >>> >>         max = 50331648;
> >>> >>         used = 5563520;
> >>> >>        };
> >>> >>      };
> >>> >>     ( CMS Old Gen ) = {
> >>> >>       key = CMS Old Gen;
> >>> >>       value = {
> >>> >>         committed = 3594133504;
> >>> >>         init = 112459776;
> >>> >>         max = 4120510464;
> >>> >>         used = 959221872;
> >>> >>        };
> >>> >>      };
> >>> >>     ( Par Eden Space ) = {
> >>> >>       key = Par Eden Space;
> >>> >>       value = {
> >>> >>         committed = 171835392;
> >>> >>         init = 21495808;
> >>> >>         max = 171835392;
> >>> >>         used = 171835392;
> >>> >>        };
> >>> >>      };
> >>> >>     ( Par Survivor Space ) = {
> >>> >>       key = Par Survivor Space;
> >>> >>       value = {
> >>> >>         committed = 1310720;
> >>> >>         init = 131072;
> >>> >>         max = 1310720;
> >>> >>         used = 0;
> >>> >>        };
> >>> >>      };
> >>> >>    };
> >>> >>   startTime = 42219136;
> >>> >>  };
> >>> >> CollectionCount = 11720;
> >>> >> CollectionTime = 4561730;
> >>> >> Name = ParNew;
> >>> >> Valid = true;
> >>> >> MemoryPoolNames = [ Par Eden Space, Par Survivor Space ];
> >>> >> #mbean = java.lang:type=OperatingSystem:
> >>> >> MaxFileDescriptorCount = 63536;
> >>> >> OpenFileDescriptorCount = 75;
> >>> >> CommittedVirtualMemorySize = 17787711488;
> >>> >> FreePhysicalMemorySize = 45522944;
> >>> >> FreeSwapSpaceSize = 2123968512;
> >>> >> ProcessCpuTime = 12251460000000;
> >>> >> TotalPhysicalMemorySize = 8364417024;
> >>> >> TotalSwapSpaceSize = 4294959104;
> >>> >> Name = Linux;
> >>> >> AvailableProcessors = 8;
> >>> >> Arch = amd64;
> >>> >> SystemLoadAverage = 4.36;
> >>> >> Version = 2.6.18-164.15.1.el5;
> >>> >> #mbean = java.lang:type=Runtime:
> >>> >> Name = 20281@ob1061.nydc1.outbrain.com;
> >>> >>
> >>> >> ClassPath = /outbrain/cassandra/apache-cassandra-0.6.1/bin/../conf:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../build/classes:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/antlr-3.1.3.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/apache-cassandra-0.6.1.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/avro-1.2.0-dev.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/clhm-production.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/commons-cli-1.1.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/commons-codec-1.2.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/commons-collections-3.2.1.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/commons-lang-2.4.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/google-collections-1.0.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/hadoop-core-0.20.1.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/high-scale-lib.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/ivy-2.1.0.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/jackson-core-asl-1.4.0.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/jackson-mapper-asl-1.4.0.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/jline-0.9.94.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/json-simple-1.1.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/libthrift-r917130.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/log4j-1.2.14.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/slf4j-api-1.5.8.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/slf4j-log4j12-1.5.8.jar;
> >>> >> BootClassPath = /usr/java/jdk1.6.0_17/jre/lib/alt-rt.jar:/usr/java/jdk1.6.0_17/jre/lib/resources.jar:/usr/java/jdk1.6.0_17/jre/lib/rt.jar:/usr/java/jdk1.6.0_17/jre/lib/sunrsasign.jar:/usr/java/jdk1.6.0_17/jre/lib/jsse.jar:/usr/java/jdk1.6.0_17/jre/lib/jce.jar:/usr/java/jdk1.6.0_17/jre/lib/charsets.jar:/usr/java/jdk1.6.0_17/jre/classes;
> >>> >> LibraryPath = /usr/java/jdk1.6.0_17/jre/lib/amd64/server:/usr/java/jdk1.6.0_17/jre/lib/amd64:/usr/java/jdk1.6.0_17/jre/../lib/amd64:/usr/java/packages/lib/amd64:/lib:/usr/lib;
> >>> >> VmName = Java HotSpot(TM) 64-Bit Server VM;
> >>> >> VmVendor = Sun Microsystems Inc.;
> >>> >> VmVersion = 14.3-b01;
> >>> >> BootClassPathSupported = true;
> >>> >> InputArguments = [ -ea, -Xms128M, -Xmx4G, -XX:TargetSurvivorRatio=90,
> >>> >> -XX:+AggressiveOpts, -XX:+UseParNewGC, -XX:+UseConcMarkSweepGC,
> >>> >> -XX:+CMSParallelRemarkEnabled, -XX:+HeapDumpOnOutOfMemoryError,
> >>> >> -XX:SurvivorRatio=128, -XX:MaxTenuringThreshold=0,
> >>> >> -Dcom.sun.management.jmxremote.port=9004,
> >>> >> -Dcom.sun.management.jmxremote.ssl=false,
> >>> >> -Dcom.sun.management.jmxremote.authenticate=false,
> >>> >> -Dstorage-config=/outbrain/cassandra/apache-cassandra-0.6.1/bin/../conf,
> >>> >> -Dcassandra-pidfile=/var/run/cassandra.pid ];
> >>> >> ManagementSpecVersion = 1.2;
> >>> >> SpecName = Java Virtual Machine Specification;
> >>> >> SpecVendor = Sun Microsystems Inc.;
> >>> >> SpecVersion = 1.0;
> >>> >> StartTime = 1272911001415;
> >>> >> ...
> >>> >
> >>
> >>
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>
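
For reference, the token "formula" quoted in the thread (i * 2^127 / nb_nodes) can be sketched in Python as below. This is only an illustration under stated assumptions: the 2^127 ring size is taken from that formula (it matches the RandomPartitioner token space, which the thread does not name explicitly), the function names balanced_tokens and ring_shares are made up here, and each node is assumed to own the range from the previous token up to its own. Replica placement across the two data centers is not modelled, so the shares only hint at the imbalance seen in the ring output.

```python
# Ring size taken from the formula quoted in the thread (assumption:
# RandomPartitioner-style tokens in the range [0, 2**127)).
RING_SIZE = 2 ** 127


def balanced_tokens(nb_nodes):
    """Evenly spaced tokens: i * 2**127 / nb_nodes for i = 0..nb_nodes-1."""
    if nb_nodes < 1:
        raise ValueError("nb_nodes must be at least 1")
    return [i * RING_SIZE // nb_nodes for i in range(nb_nodes)]


def ring_shares(tokens):
    """Fraction of the ring between each token and the one before it.

    Results are returned in ascending token order.
    """
    ordered = sorted(tokens)
    if len(ordered) == 1:
        return [1.0]
    shares = []
    for idx, token in enumerate(ordered):
        previous = ordered[idx - 1]  # idx 0 wraps around to the last token
        shares.append(((token - previous) % RING_SIZE) / RING_SIZE)
    return shares


# Tokens copied from the nodetool ring output quoted in the thread.
CURRENT_TOKENS = [
    66002764663998929243644931915471302076,
    81288739225600737067856268063987022738,
    86999744104066390588161689990810839743,
    88308919879653155454332084719458267849,
    142482163220375328195837946953175033937,
    144413773383729447702215082383444206680,
]

if __name__ == "__main__":
    print("evenly spaced tokens for 6 nodes:")
    for token in balanced_tokens(6):
        print(" ", token)
    print("ring share per current token (ascending token order):")
    print(" ", [round(s, 3) for s in ring_shares(CURRENT_TOKENS)])
```

Comparing the two lists of shares shows how far the current token assignment is from an even split.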
