cassandra-user mailing list archives

From "Freeman, Tim" <tim.free...@hp.com>
Subject RE: Persistently increasing read latency
Date Wed, 02 Dec 2009 01:31:10 GMT
From: Jonathan Ellis [mailto:jbellis@gmail.com] 
>1) use jconsole to see what is happening to jvm / cassandra internals.
> possibly you are slowly exceeding cassandra's ability to keep up with
>writes, causing the jvm to spend more and more effort GCing to find
>enough memory to keep going

After looking at jconsole, I don't think so.  Cassandra has been up 18 hours.  It says it
has spent 2 hours 15 minutes on ParNew GCs and 1 second on ConcurrentMarkSweep, so we aren't
in the middle of GC death.  jconsole's view of the memory used has a sawtooth curve increasing
from around 0.05GB to 0.95GB in about 7 minutes, then dropping suddenly back to 0.05GB.  CPU
usage is 2-4% of one of four cores.
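
To put the GC figures above in proportion, here is a minimal sketch that turns accumulated GC time into a fraction of uptime. The class name GcOverhead and its methods are made up for illustration. The totalGcMillis() helper reads the standard java.lang.management beans of the JVM it runs in, so it reports on itself, not on a remote Cassandra process; main() just redoes the arithmetic on the numbers quoted from jconsole.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Cassandra.
final class GcOverhead {

    /** Fraction of uptime spent in GC; both arguments are in milliseconds. */
    static double fraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            throw new IllegalArgumentException("uptime must be positive");
        }
        return (double) gcMillis / uptimeMillis;
    }

    /** Sums the accumulated collection time of every collector in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Figures from the message: 2 h 15 min ParNew + 1 s CMS over 18 h of uptime.
        long reportedGc = (2 * 60 + 15) * 60_000L + 1_000L;
        long reportedUptime = 18 * 3_600_000L;
        System.out.printf("reported GC share: %.1f%%%n",
                100 * fraction(reportedGc, reportedUptime));

        // The same ratio for the JVM running this program.
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("this JVM GC share: %.1f%%%n",
                100 * fraction(totalGcMillis(), Math.max(uptime, 1)));
    }
}
```

With the quoted figures the first line works out to roughly 12.5% of uptime, nearly all of it in ParNew.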

Looking at the Cassandra MBeans, the attributes of ROW-MUTATION-STAGE, ROW-READ-STAGE,
and RESPONSE-STAGE are all less than 10.  MINOR-COMPACTION-POOL reports 1218 pending tasks.
(I really like the idea of using MBeans for this.  Thanks for the example.)
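
The same MBean attributes can be read from code instead of jconsole. Below is a minimal sketch using the standard javax.management API. The class name StageAttributes is made up, and the Cassandra-specific object-name pattern and attribute name mentioned in the comments are assumptions that should be checked against what jconsole actually shows. The demo in main() queries the local JVM's own Runtime bean so it runs without a Cassandra node.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.AttributeNotFoundException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Hypothetical helper, not part of Cassandra.
final class StageAttributes {

    /**
     * Reads one attribute from every MBean whose name matches the pattern.
     * MBeans that lack the attribute are skipped.
     */
    static Map<String, Object> read(MBeanServerConnection conn,
                                    String pattern, String attribute) {
        Map<String, Object> values = new TreeMap<>();
        try {
            for (ObjectName name : conn.queryNames(new ObjectName(pattern), null)) {
                try {
                    values.put(name.getCanonicalName(), conn.getAttribute(name, attribute));
                } catch (AttributeNotFoundException e) {
                    // This bean does not expose the attribute; ignore it.
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("JMX query failed", e);
        }
        return values;
    }

    public static void main(String[] args) {
        // Local demo against this JVM's platform MBean server.
        MBeanServerConnection local = ManagementFactory.getPlatformMBeanServer();
        System.out.println(read(local, "java.lang:type=Runtime", "Uptime"));

        // For a Cassandra node, obtain the connection from
        // javax.management.remote.JMXConnectorFactory and use the object names
        // shown in jconsole. ASSUMPTION: something like the pattern
        // "org.apache.cassandra.concurrent:*" with attribute "PendingTasks";
        // verify both before relying on them.
    }
}
```

Polling such a method once a minute and logging the result would show whether the MINOR-COMPACTION-POOL backlog is growing, shrinking, or flat over the run.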

>2) you should be at least on 0.4.2 and preferably trunk if you are
>stress testing

Fair enough. 

I left it running since my previous email, and the average read latency seems to have maxed
out at about 1 second after 50K seconds (13 hours) of run time.  Perhaps the problem is just
that I didn't wait long enough for the system to come into equilibrium.

Tim Freeman
Email: tim.freeman@hp.com
Desk in Palo Alto: (650) 857-2581
Home: (408) 774-1298
Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and Thursday; call my desk
instead.)


-----Original Message-----
From: Jonathan Ellis [mailto:jbellis@gmail.com] 
Sent: Tuesday, December 01, 2009 11:10 AM
To: cassandra-user@incubator.apache.org
Subject: Re: Persistently increasing read latency

1) use jconsole to see what is happening to jvm / cassandra internals.
 possibly you are slowly exceeding cassandra's ability to keep up with
writes, causing the jvm to spend more and more effort GCing to find
enough memory to keep going

2) you should be at least on 0.4.2 and preferably trunk if you are
stress testing

-Jonathan

On Tue, Dec 1, 2009 at 12:11 PM, Freeman, Tim <tim.freeman@hp.com> wrote:
> In an 8 hour test run, I've seen the read latency for Cassandra drift fairly linearly
> from ~460ms to ~900ms.  Eventually my application gets starved for reads and starts misbehaving.
> I have attached graphs -- horizontal scales are seconds, vertical scales are operations
> per minute and average milliseconds per operation.  The clearest feature is the light blue
> line in the left graph drifting consistently upward during the run.
>
> I have a Cassandra 0.4.1 database, one node, records are 100kbytes each, 350K records,
> 8 threads reading, around 700 reads per minute.  There are also 8 threads writing.  This
> is all happening on a 4 core processor that's supporting both the Cassandra node and the code
> that's generating load for it.  I'm reasonably sure that there are no page faults.
>
> I have attached my storage-conf.xml.  Briefly, it has default values, except RpcTimeoutInMillis
> is 30000 and the partitioner is OrderPreservingPartitioner.  Cassandra's garbage collection
> parameters are:
>
>   -Xms128m -Xmx1G -XX:SurvivorRatio=8 -XX:+AggressiveOpts -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
>
> Is this normal behavior?  Is there some change to the configuration I should make to
> get it to stop getting slower?  If it's not normal, what debugging information should I gather?
> Should I give up on Cassandra 0.4.1 and move to a newer version?
>
> I'll leave it running for the time being in case there's something useful to extract
> from it.
>
> Tim Freeman
> Email: tim.freeman@hp.com
> Desk in Palo Alto: (650) 857-2581
> Home: (408) 774-1298
> Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and Thursday; call
> my desk instead.)
>
>
