cassandra-user mailing list archives

From Brian Frank Cooper <coop...@yahoo-inc.com>
Subject RE: Anybody experience one Cassandra server locking up?
Date Thu, 20 Aug 2009 00:19:37 GMT
We are trying to learn what we can about the performance of Cassandra. I hope to have some
results to share publicly in the next couple of weeks. 

The 0.4 version seems to have handled the insert load better, but is having trouble with a
50/50 read/write workload. One server again has a busy core with the other 7 cores (and the
other servers) idle or near idle. Any ideas? The problem seems to come when we dial up the
request rate made by the client; after a certain point, the achievable throughput slows way
down, even lower than what we could have achieved with a lower request rate. (Incidentally,
we are reading and writing 10 KB records; does the large data size have any impact?) And using
top -H, it looks like it is one of the Java threads that is consistently busy. Maybe it is
GC again.
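
A minimal sketch of how the busy thread seen in top -H could be matched to a JVM thread, assuming a thread dump (for example from jstack) can be obtained: top -H reports thread IDs in decimal, while HotSpot thread dumps list the same ID in hex as nid=0x... . The helper name find_thread_by_tid and the sample dump are hypothetical and not taken from this cluster.

```python
import re

def find_thread_by_tid(thread_dump, tid):
    """Return the name of the thread whose native id (nid) equals tid, or None.

    tid is the decimal thread ID shown by `top -H`; thread dumps print the
    same value in hexadecimal, so it is converted before comparing.
    """
    wanted = format(tid, "x")
    for line in thread_dump.splitlines():
        m = re.match(r'"(?P<name>[^"]*)".*\bnid=0x(?P<nid>[0-9a-fA-F]+)', line)
        if m and m.group("nid").lower() == wanted:
            return m.group("name")
    return None

# Hypothetical excerpt in the style of a HotSpot thread dump (illustrative only).
SAMPLE_DUMP = '''\
"ROW-MUTATION-STAGE:1" prio=10 tid=0x00002aaab4a3c000 nid=0x1a2f runnable [0x0000000041e5b000]
"Concurrent Mark-Sweep GC Thread" prio=10 tid=0x000000004012d800 nid=0x1a0c runnable
"VM Thread" prio=10 tid=0x0000000040196000 nid=0x1a0d runnable
'''

# 6668 decimal is 0x1a0c, so this prints: Concurrent Mark-Sweep GC Thread
print(find_thread_by_tid(SAMPLE_DUMP, 6668))
```

If the matching thread turns out to be a collector thread, that would support the GC explanation; if it is a request-handling thread, the cause is more likely elsewhere.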

I was hoping to chat with some of you Cassandra folks when we visited FB last week...perhaps
we can grab coffee sometime and chat about these issues...

Thanks!

brian
________________________________________
From: Sandeep Tata [sandeep.tata@gmail.com]
Sent: Wednesday, August 19, 2009 1:29 PM
To: cassandra-user@incubator.apache.org
Subject: Re: Anybody experience one Cassandra server locking up?

Brian,

Are you guys planning to run workloads at Yahoo to compare Cassandra and PNUTS?
We'd be curious to see what you learn with the 0.4/trunk code.

Sandeep

On Wed, Aug 19, 2009 at 10:20 AM, Brian Frank
Cooper<cooperb@yahoo-inc.com> wrote:
> Probably you are right; after Jun's response I looked in the log and saw an out of memory
> exception. I'll try the 0.4 beta...
>
> Thanks!
>
> brian
>
> -----Original Message-----
> From: Jonathan Ellis [mailto:jbellis@gmail.com]
> Sent: Wednesday, August 19, 2009 9:12 AM
> To: cassandra-user@incubator.apache.org
> Subject: Re: Anybody experience one Cassandra server locking up?
>
> sounds like you are exhausting the memory on that instance and it is
> going into "GC swap" trying to free enough to continue.  this is very
> easy to do on 0.3 -- try upgrading to the 0.4 beta if you are using
> 0.3.
>
> On Tue, Aug 18, 2009 at 3:36 PM, Brian Frank
> Cooper<cooperb@yahoo-inc.com> wrote:
>> Hi folks,
>>
>>
>>
>> I have been loading a 6-server Cassandra cluster with 1KB records. After a
>> few million inserts, the insert rate drops dramatically. After
>> investigation, one of the Cassandra servers seems to be in a bad state,
>> using 100% of one core on an 8-core machine, and 0% on the other cores.
>> Inserts to this box have completely stopped, and the inserts to the other
>> boxes have slowed way down (more than a factor of 10 slower.) A "kill" or
>> "kill -3" to the bad java process does nothing; I have to use "kill -9" to
>> stop it. Has anybody experienced anything like this?
>>
>>
>>
>> Additional info:
>>
>>
>>
>> The servers are 8 core, 8GB servers. I am running 64 bit java 1.6, and here
>> are the JVM options:
>>
>>
>>
>> # Arguments to pass to the JVM
>> JVM_OPTS=" \
>>         -ea \
>>         -Xdebug \
>>         -Xrunjdwp:transport=dt_socket,server=y,address=8888,suspend=n \
>>         -Xms128M \
>>         -Xmx6G \
>>         -XX:SurvivorRatio=8 \
>>         -XX:TargetSurvivorRatio=90 \
>>         -XX:+AggressiveOpts \
>>         -XX:+UseParNewGC \
>>         -XX:+UseConcMarkSweepGC \
>>         -XX:CMSInitiatingOccupancyFraction=1 \
>>         -XX:+CMSParallelRemarkEnabled \
>>         -XX:+HeapDumpOnOutOfMemoryError \
>>         -Dcom.sun.management.jmxremote.port=8080 \
>>         -Dcom.sun.management.jmxremote.ssl=false \
>>         -Dcom.sun.management.jmxremote.authenticate=false"
>>
>>
>>
>> (standard options from the Cassandra distribution, except for the 6GB of
>> heap space.)
>>
>>
>>
>> Replication factor is 1 (this is just a test, not a production setup) and
>> memtable size is set to 1GB.
>>
>>
>>
>> Thanks.
>>
>>
>>
>> brian
>
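
The "GC swap" diagnosis discussed in this thread could be checked from outside the process with periodic samples from jstat -gcutil. The following is a rough sketch, assuming the Java 6 column layout (O for old-generation occupancy in percent, FGC for the full-GC event count). The function names, the 95% threshold, and the sample numbers are all hypothetical and only illustrate the pattern of a full old generation with a climbing full-GC count.

```python
def parse_jstat(text):
    """Parse `jstat -gcutil` style output into a list of {column: value} dicts."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, map(float, row))) for row in rows]

def looks_like_gc_thrash(samples, old_pct=95.0):
    """True if the old generation stays nearly full while full GCs keep occurring.

    old_pct is an arbitrary illustrative threshold, not a recommended value.
    """
    first, last = samples[0], samples[-1]
    full_gcs = last["FGC"] - first["FGC"]
    return last["O"] >= old_pct and full_gcs > 0

# Hypothetical samples, invented for illustration (not measured on this cluster).
SAMPLE = """
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00  99.80 100.00  99.97  35.12   1520   41.220   210  388.100  429.320
  0.00  99.80 100.00  99.98  35.12   1520   41.220   214  396.900  438.120
  0.00  99.80 100.00  99.98  35.12   1520   41.220   219  407.300  448.520
"""

samples = parse_jstat(SAMPLE)
print(looks_like_gc_thrash(samples))
```

With the CMS collector the FGC counter may also count pauses from concurrent cycles, so a rising count on its own is only a rough signal; it is the combination with a persistently full old generation that points toward memory exhaustion.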
