incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jonathan Ellis <jbel...@gmail.com>
Subject Re: performance tuning - where does the slowness come from?
Date Tue, 04 May 2010 19:27:35 GMT
Are you using 32 bit hosts?  If not don't be scared of mmap using a
lot of address space, you have plenty.  It won't make you swap more
than using buffered i/o.

On Tue, May 4, 2010 at 1:57 PM, Ran Tavory <rantav@gmail.com> wrote:
> I canceled mmap and indeed memory usage is sane again. So far performance
> hasn't been great, but I'll wait and see.
> I'm also interested in a way to cap mmap so I can take advantage of it but
> not swap the host to death...
>
> On Tue, May 4, 2010 at 9:38 PM, Kyusik Chung <kyusik@discovereads.com>
> wrote:
>>
>> This sounds just like the slowness I was asking about in another thread -
>> after a lot of reads, the machine uses up all available memory on the box
>> and then starts swapping.
>> My understanding was that mmap helps greatly with read and write perf
>> (until the box starts swapping I guess)...is there any way to use mmap and
>> cap how much memory it takes up?
>> What do people use in production?  mmap or no mmap?
>> Thanks!
>> Kyusik Chung
>> On May 4, 2010, at 10:11 AM, Schubert Zhang wrote:
>>
>> 1. When initially startup your nodes, please plan your InitialToken of
>> each node evenly.
>> 2. <DiskAccessMode>standard</DiskAccessMode>
>>
>> On Tue, May 4, 2010 at 9:09 PM, Boris Shulman <shulmanb@gmail.com> wrote:
>>>
>>> I think that the extra (more than 4GB) memory usage comes from the
>>> mmaped io, that is why it happens only for reads.
>>>
>>> On Tue, May 4, 2010 at 2:02 PM, Jordan Pittier <jordan.pittier@gmail.com>
>>> wrote:
>>> > I'm facing the same issue with swap. It only occurs when I perform read
>>> > operations (write are very fast :)). So I can't help you with the
>>> > memory
>>> > probleme.
>>> >
>>> > But to balance the load evenly between nodes in cluster just manually
>>> > fix
>>> > their token.(the "formula" is i * 2^127 / nb_nodes).
>>> >
>>> > Jordzn
>>> >
>>> > On Tue, May 4, 2010 at 8:20 AM, Ran Tavory <rantav@gmail.com> wrote:
>>> >>
>>> >> I'm looking into performance issues on a 0.6.1 cluster. I see two
>>> >> symptoms:
>>> >> 1. Reads and writes are slow
>>> >> 2. One of the hosts is doing a lot of GC.
>>> >> 1 is slow in the sense that in normal state the cluster used to make
>>> >> around 3-5k read and writes per second (6-10k operations per second),
>>> >> but
>>> >> how it's in the order of 200-400 ops per second, sometimes even less.
>>> >> 2 looks like this:
>>> >> $ tail -f /outbrain/cassandra/log/system.log
>>> >>  INFO [GC inspection] 2010-05-04 00:42:18,636 GCInspector.java (line
>>> >> 110)
>>> >> GC for ParNew: 672 ms, 166482384 reclaimed leaving 2872087208 used;
>>> >> max is
>>> >> 4432068608
>>> >>  INFO [GC inspection] 2010-05-04 00:42:28,638 GCInspector.java (line
>>> >> 110)
>>> >> GC for ParNew: 498 ms, 166493352 reclaimed leaving 2836049448 used;
>>> >> max is
>>> >> 4432068608
>>> >>  INFO [GC inspection] 2010-05-04 00:42:38,640 GCInspector.java (line
>>> >> 110)
>>> >> GC for ParNew: 327 ms, 166091528 reclaimed leaving 2796888424 used;
>>> >> max is
>>> >> 4432068608
>>> >> ... and it goes on and on for hours, no stopping...
>>> >> The cluster is made of 6 hosts, 3 in one DC and 3 in another.
>>> >> Each host has 8G RAM.
>>> >> -Xmx=4G
>>> >> For some reason, the load isn't distributed evenly b/w the hosts,
>>> >> although
>>> >> I'm not sure this is the cause for slowness
>>> >> $ nodetool -h localhost -p 9004 ring
>>> >> Address       Status     Load          Range
>>> >>        Ring
>>> >>
>>> >> 144413773383729447702215082383444206680
>>> >> 192.168.252.99Up         15.94 GB
>>> >>  66002764663998929243644931915471302076     |<--|
>>> >> 192.168.254.57Up         19.84 GB
>>> >>  81288739225600737067856268063987022738     |   ^
>>> >> 192.168.254.58Up         973.78 MB
>>> >> 86999744104066390588161689990810839743     v   |
>>> >> 192.168.252.62Up         5.18 GB
>>> >> 88308919879653155454332084719458267849     |   ^
>>> >> 192.168.254.59Up         10.57 GB
>>> >>  142482163220375328195837946953175033937    v   |
>>> >> 192.168.252.61Up         11.36 GB
>>> >>  144413773383729447702215082383444206680    |-->|
>>> >> The slow host is 192.168.252.61 and it isn't the most loaded one.
>>> >> The host is waiting a lot on IO and the load average is usually 6-7
>>> >> $ w
>>> >>  00:42:56 up 11 days, 13:22,  1 user,  load average: 6.21, 5.52,
3.93
>>> >> $ vmstat 5
>>> >> procs -----------memory---------- ---swap-- -----io---- --system--
>>> >> -----cpu------
>>> >>  r  b   swpd   free   buff  cache   si   so    bi    bo
  in   cs us
>>> >> sy id
>>> >> wa st
>>> >>  0  8 2147844  45744   1816 4457384    6    5    66    32
   5    2  1
>>> >>  1
>>> >> 96  2  0
>>> >>  0  8 2147164  49020   1808 4451596  385    0  2345    58
3372 9957  2
>>> >>  2
>>> >> 78 18  0
>>> >>  0  3 2146432  45704   1812 4453956  342    0  2274   108
3937 10732
>>> >>  2  2
>>> >> 78 19  0
>>> >>  0  1 2146252  44696   1804 4453436  345  164  1939   294
3647 7833  2
>>> >>  2
>>> >> 78 18  0
>>> >>  0  1 2145960  46924   1744 4451260  158    0  2423   122
4354 14597
>>> >>  2  2
>>> >> 77 18  0
>>> >>  7  1 2138344  44676    952 4504148 1722  403  1722   406
1388  439 87
>>> >>  0
>>> >> 10  2  0
>>> >>  7  2 2137248  45652    956 4499436 1384  655  1384   658
1356  392 87
>>> >>  0
>>> >> 10  3  0
>>> >>  7  1 2135976  46764    956 4495020 1366  718  1366   718
1395  380 87
>>> >>  0
>>> >>  9  4  0
>>> >>  0  8 2134484  46964    956 4489420 1673  555  1814   586
1601 215590
>>> >> 14
>>> >>  2 68 16  0
>>> >>  0  1 2135388  47444    972 4488516  785  833  2390   995
3812 8305  2
>>> >>  2
>>> >> 77 20  0
>>> >>  0 10 2135164  45928    980 4488796  788  543  2275   626
36
>>> >> So, the host is swapping like crazy...
>>> >> top shows that it's using a lot of memory. As noted before -Xmx=4G and
>>> >> nothing else seems to be using a lot of memory on the host except for
>>> >> the
>>> >> cassandra process, however, of the 8G ram on the host, 92% is used by
>>> >> cassandra. How's that?
>>> >> Top shows there's 3.9g Shared and 7.2g Resident and 15.9g Virtual. Why
>>> >> does it have 15g virtual? And why 7.2 RES? This can explain the
>>> >> slowness in
>>> >> swapping.
>>> >> $ top
>>> >>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+
 COMMAND
>>> >>
>>> >>
>>> >> 20281 cassandr  25   0 15.9g 7.2g 3.9g S 33.3 92.6 175:30.27 java
>>> >> So, can the total memory be controlled?
>>> >> Or perhaps I'm looking in the wrong direction...
>>> >> I've looked at all the cassandra JMX counts and nothing seemed
>>> >> suspicious
>>> >> so far. By suspicious i mean a large number of pending tasks - there
>>> >> were
>>> >> always very small numbers in each pool.
>>> >> About read and write latencies, I'm not sure what the normal state is,
>>> >> but
>>> >> here's an example of what I see on the problematic host:
>>> >> #mbean = org.apache.cassandra.service:type=StorageProxy:
>>> >> RecentReadLatencyMicros = 30105.888180684495;
>>> >> TotalReadLatencyMicros = 78543052801;
>>> >> TotalWriteLatencyMicros = 4213118609;
>>> >> RecentWriteLatencyMicros = 1444.4809201925639;
>>> >> ReadOperations = 4779553;
>>> >> RangeOperations = 0;
>>> >> TotalRangeLatencyMicros = 0;
>>> >> RecentRangeLatencyMicros = NaN;
>>> >> WriteOperations = 4740093;
>>> >> And the only pool that I do see some pending tasks is the
>>> >> ROW-READ-STAGE,
>>> >> but it doesn't look like much, usually around 6-8:
>>> >> #mbean = org.apache.cassandra.concurrent:type=ROW-READ-STAGE:
>>> >> ActiveCount = 8;
>>> >> PendingTasks = 8;
>>> >> CompletedTasks = 5427955;
>>> >> Any help finding the solution is appreciated, thanks...
>>> >> Below are a few more JMXes I collected from the system that may be
>>> >> interesting.
>>> >> #mbean = java.lang:type=Memory:
>>> >> Verbose = false;
>>> >> HeapMemoryUsage = {
>>> >>   committed = 3767279616;
>>> >>   init = 134217728;
>>> >>   max = 4293656576;
>>> >>   used = 1237105080;
>>> >>  };
>>> >> NonHeapMemoryUsage = {
>>> >>   committed = 35061760;
>>> >>   init = 24313856;
>>> >>   max = 138412032;
>>> >>   used = 23151320;
>>> >>  };
>>> >> ObjectPendingFinalizationCount = 0;
>>> >> #mbean = java.lang:name=ParNew,type=GarbageCollector:
>>> >> LastGcInfo = {
>>> >>   GcThreadCount = 11;
>>> >>   duration = 136;
>>> >>   endTime = 42219272;
>>> >>   id = 11719;
>>> >>   memoryUsageAfterGc = {
>>> >>     ( CMS Perm Gen ) = {
>>> >>       key = CMS Perm Gen;
>>> >>       value = {
>>> >>         committed = 29229056;
>>> >>         init = 21757952;
>>> >>         max = 88080384;
>>> >>         used = 17648848;
>>> >>        };
>>> >>      };
>>> >>     ( Code Cache ) = {
>>> >>       key = Code Cache;
>>> >>       value = {
>>> >>         committed = 5832704;
>>> >>         init = 2555904;
>>> >>         max = 50331648;
>>> >>         used = 5563520;
>>> >>        };
>>> >>      };
>>> >>     ( CMS Old Gen ) = {
>>> >>       key = CMS Old Gen;
>>> >>       value = {
>>> >>         committed = 3594133504;
>>> >>         init = 112459776;
>>> >>         max = 4120510464;
>>> >>         used = 964565720;
>>> >>        };
>>> >>      };
>>> >>     ( Par Eden Space ) = {
>>> >>       key = Par Eden Space;
>>> >>       value = {
>>> >>         committed = 171835392;
>>> >>         init = 21495808;
>>> >>         max = 171835392;
>>> >>         used = 0;
>>> >>        };
>>> >>      };
>>> >>     ( Par Survivor Space ) = {
>>> >>       key = Par Survivor Space;
>>> >>       value = {
>>> >>         committed = 1310720;
>>> >>         init = 131072;
>>> >>         max = 1310720;
>>> >>         used = 0;
>>> >>        };
>>> >>      };
>>> >>    };
>>> >>   memoryUsageBeforeGc = {
>>> >>     ( CMS Perm Gen ) = {
>>> >>       key = CMS Perm Gen;
>>> >>       value = {
>>> >>         committed = 29229056;
>>> >>         init = 21757952;
>>> >>         max = 88080384;
>>> >>         used = 17648848;
>>> >>        };
>>> >>      };
>>> >>     ( Code Cache ) = {
>>> >>       key = Code Cache;
>>> >>       value = {
>>> >>         committed = 5832704;
>>> >>         init = 2555904;
>>> >>         max = 50331648;
>>> >>         used = 5563520;
>>> >>        };
>>> >>      };
>>> >>     ( CMS Old Gen ) = {
>>> >>       key = CMS Old Gen;
>>> >>       value = {
>>> >>         committed = 3594133504;
>>> >>         init = 112459776;
>>> >>         max = 4120510464;
>>> >>         used = 959221872;
>>> >>        };
>>> >>      };
>>> >>     ( Par Eden Space ) = {
>>> >>       key = Par Eden Space;
>>> >>       value = {
>>> >>         committed = 171835392;
>>> >>         init = 21495808;
>>> >>         max = 171835392;
>>> >>         used = 171835392;
>>> >>        };
>>> >>      };
>>> >>     ( Par Survivor Space ) = {
>>> >>       key = Par Survivor Space;
>>> >>       value = {
>>> >>         committed = 1310720;
>>> >>         init = 131072;
>>> >>         max = 1310720;
>>> >>         used = 0;
>>> >>        };
>>> >>      };
>>> >>    };
>>> >>   startTime = 42219136;
>>> >>  };
>>> >> CollectionCount = 11720;
>>> >> CollectionTime = 4561730;
>>> >> Name = ParNew;
>>> >> Valid = true;
>>> >> MemoryPoolNames = [ Par Eden Space, Par Survivor Space ];
>>> >> #mbean = java.lang:type=OperatingSystem:
>>> >> MaxFileDescriptorCount = 63536;
>>> >> OpenFileDescriptorCount = 75;
>>> >> CommittedVirtualMemorySize = 17787711488;
>>> >> FreePhysicalMemorySize = 45522944;
>>> >> FreeSwapSpaceSize = 2123968512;
>>> >> ProcessCpuTime = 12251460000000;
>>> >> TotalPhysicalMemorySize = 8364417024;
>>> >> TotalSwapSpaceSize = 4294959104;
>>> >> Name = Linux;
>>> >> AvailableProcessors = 8;
>>> >> Arch = amd64;
>>> >> SystemLoadAverage = 4.36;
>>> >> Version = 2.6.18-164.15.1.el5;
>>> >> #mbean = java.lang:type=Runtime:
>>> >> Name = 20281@ob1061.nydc1.outbrain.com;
>>> >>
>>> >> ClassPath =
>>> >>
>>> >> /outbrain/cassandra/apache-cassandra-0.6.1/bin/../conf:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../build/classes:/outbrain/cassandra/apache-cassandra-0.6.1/bin/..
>>> >>
>>> >>
>>> >> /lib/antlr-3.1.3.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/apache-cassandra-0.6.1.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/avro-1.2.0-dev.jar:/outb
>>> >>
>>> >>
>>> >> rain/cassandra/apache-cassandra-0.6.1/bin/../lib/clhm-production.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/commons-cli-1.1.jar:/outbrain/cassandra/apache-cassandra-
>>> >>
>>> >>
>>> >> 0.6.1/bin/../lib/commons-codec-1.2.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/commons-collections-3.2.1.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/com
>>> >>
>>> >>
>>> >> mons-lang-2.4.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/google-collections-1.0.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/hadoop-core-0.20.1.jar:/out
>>> >>
>>> >>
>>> >> brain/cassandra/apache-cassandra-0.6.1/bin/../lib/high-scale-lib.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/ivy-2.1.0.jar:/outbrain/cassandra/apache-cassandra-0.6.1/
>>> >>
>>> >>
>>> >> bin/../lib/jackson-core-asl-1.4.0.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/jackson-mapper-asl-1.4.0.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/jline
>>> >>
>>> >>
>>> >> -0.9.94.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/json-simple-1.1.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/libthrift-r917130.jar:/outbrain/cassandr
>>> >>
>>> >>
>>> >> a/apache-cassandra-0.6.1/bin/../lib/log4j-1.2.14.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib/slf4j-api-1.5.8.jar:/outbrain/cassandra/apache-cassandra-0.6.1/bin/../lib
>>> >> /slf4j-log4j12-1.5.8.jar;
>>> >>
>>> >> BootClassPath =
>>> >>
>>> >> /usr/java/jdk1.6.0_17/jre/lib/alt-rt.jar:/usr/java/jdk1.6.0_17/jre/lib/resources.jar:/usr/java/jdk1.6.0_17/jre/lib/rt.jar:/usr/java/jdk1.6.0_17/jre/lib/sunrsasign.j
>>> >>
>>> >>
>>> >> ar:/usr/java/jdk1.6.0_17/jre/lib/jsse.jar:/usr/java/jdk1.6.0_17/jre/lib/jce.jar:/usr/java/jdk1.6.0_17/jre/lib/charsets.jar:/usr/java/jdk1.6.0_17/jre/classes;
>>> >>
>>> >> LibraryPath =
>>> >>
>>> >> /usr/java/jdk1.6.0_17/jre/lib/amd64/server:/usr/java/jdk1.6.0_17/jre/lib/amd64:/usr/java/jdk1.6.0_17/jre/../lib/amd64:/usr/java/packages/lib/amd64:/lib:/usr/lib;
>>> >>
>>> >> VmName = Java HotSpot(TM) 64-Bit Server VM;
>>> >>
>>> >> VmVendor = Sun Microsystems Inc.;
>>> >>
>>> >> VmVersion = 14.3-b01;
>>> >>
>>> >> BootClassPathSupported = true;
>>> >>
>>> >> InputArguments = [ -ea, -Xms128M, -Xmx4G, -XX:TargetSurvivorRatio=90,
>>> >> -XX:+AggressiveOpts, -XX:+UseParNewGC, -XX:+UseConcMarkSweepGC,
>>> >> -XX:+CMSParallelRemarkEnabled, -XX:+HeapDumpOnOutOfMemoryError,
>>> >> -XX:SurvivorRatio=128, -XX:MaxTenuringThreshold=0,
>>> >> -Dcom.sun.management.jmxremote.port=9004,
>>> >> -Dcom.sun.management.jmxremote.ssl=false,
>>> >> -Dcom.sun.management.jmxremote.authenticate=false,
>>> >>
>>> >> -Dstorage-config=/outbrain/cassandra/apache-cassandra-0.6.1/bin/../conf,
>>> >> -Dcassandra-pidfile=/var/run/cassandra.pid ];
>>> >>
>>> >> ManagementSpecVersion = 1.2;
>>> >>
>>> >> SpecName = Java Virtual Machine Specification;
>>> >>
>>> >> SpecVendor = Sun Microsystems Inc.;
>>> >>
>>> >> SpecVersion = 1.0;
>>> >>
>>> >> StartTime = 1272911001415;
>>> >> ...
>>> >
>>
>>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Mime
View raw message