cassandra-user mailing list archives

From Edmond Lau <edm...@ooyala.com>
Subject cluster locks up from high MESSAGE-DESERIALIZER-POOL counts
Date Thu, 27 May 2010 19:59:20 GMT
Occasionally, one of my six nodes gets a very high
MESSAGE-DESERIALIZER-POOL pending count (over 100K).  When that
happens, it usually also has a fairly high ROW-READ-STAGE pending
count of around 4K.  All other nodes have very low load and no pending
tasks.  From reading other threads, this is usually a symptom of
garbage collection (GC) on the affected node.

When this scenario happens, my multiget_slice() queries with a
consistency level of one across ~128 keys typically fail to return
within 30 seconds, even though normally they return in under 50 ms.  I
would've expected that with a consistency level one, Cassandra
should've been able to bypass the locked up node.  My understanding is
that the coordinator node would issue multiple parallel lookups for a
key and just wait until the first out of three returns.
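
To make that expectation concrete, here is a minimal Python sketch of the
"first reply wins" model I have in mind.  It illustrates my mental model
only, not Cassandra's actual read path; the replica names, delays, and the
helpers read_from_replica() and read_at_consistency_one() are hypothetical.

```python
import concurrent.futures as cf
import time


def read_from_replica(name, delay_s, value):
    """Hypothetical replica read: sleeps to simulate latency, then answers."""
    time.sleep(delay_s)
    return name, value


def read_at_consistency_one(replicas, timeout_s):
    """Query every replica in parallel and return the first reply.

    `replicas` is a list of (name, simulated_delay_s, value) tuples.
    Raises TimeoutError if no replica answers within `timeout_s`.
    """
    pool = cf.ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(read_from_replica, n, d, v) for n, d, v in replicas]
    try:
        done, _ = cf.wait(futures, timeout=timeout_s,
                          return_when=cf.FIRST_COMPLETED)
        if not done:
            raise TimeoutError("no replica answered in time")
        # Any completed future satisfies a consistency level of one.
        return next(iter(done)).result()
    finally:
        # Do not block on the slow replica; let its thread finish on its own.
        pool.shutdown(wait=False)


if __name__ == "__main__":
    replicas = [("stuck-node", 0.5, "row"),
                ("healthy-1", 0.01, "row"),
                ("healthy-2", 0.02, "row")]
    print(read_at_consistency_one(replicas, timeout_s=2.0))
```

Under this model a single stuck replica would not delay the read, which is
why the 30 second timeouts surprise me.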

I'm using the random partitioner, a replication factor of 3 with
rack-aware partitioning, and machines with 32GB of RAM, 12GB of which
is allocated to the Java heap.  I've set my Cassandra RPC timeout to 30
seconds.

Anyone have thoughts about why this might happen?

Edmond
