It turns out that once a TProtocolException is thrown from Cassandra, the connection is useless for future operations. Pelops was closing connections when it detected TimedOutException, TTransportException and UnavailableException, but not TProtocolException. We have now changed Pelops to close connections in all cases *except* NotFoundException.
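
To make the change concrete, here is a rough sketch of the two policies. It is not the actual Pelops code: the class and method names (ConnectionPolicy, shouldCloseOld, shouldCloseNew) are made up for illustration, and the exception classes are empty stand-ins for the real Thrift/Cassandra types so the snippet compiles on its own.

```java
public class ConnectionPolicy {
    // Hypothetical stand-ins for the Thrift/Cassandra exception types,
    // declared here only so the sketch compiles without those libraries.
    static class TProtocolException extends Exception {}
    static class TTransportException extends Exception {}
    static class TimedOutException extends Exception {}
    static class UnavailableException extends Exception {}
    static class NotFoundException extends Exception {}

    // Old policy: close only on an explicit list of exceptions.
    // TProtocolException is not on the list, so a broken connection
    // would go back to the pool.
    static boolean shouldCloseOld(Exception e) {
        return e instanceof TimedOutException
            || e instanceof TTransportException
            || e instanceof UnavailableException;
    }

    // New policy: close on everything except NotFoundException,
    // which only means the requested data does not exist.
    static boolean shouldCloseNew(Exception e) {
        return !(e instanceof NotFoundException);
    }
}
```

The second form fails safe: an exception type nobody anticipated gets the connection closed rather than returned to the pool.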

Cheers,
-- 
Dan Washusen

On Friday, 8 April 2011 at 7:28 AM, Dan Washusen wrote:

Pelops uses a single connection per operation from a pool that is backed by Apache Commons Pool (assuming you're using Cassandra 0.7).  I'm not saying it's perfect but it's NOT sharing a connection over multiple threads.

Dan Hendry mentioned that he sees these errors.  Is he also using Pelops?  From his comment about retrying I'd assume not...

-- 
Dan Washusen

On Thursday, 7 April 2011 at 7:39 PM, Héctor Izquierdo Seliva wrote:

On Wed, 06-04-2011 at 21:04 -0500, Jonathan Ellis wrote:
"out of sequence response" is thrift's way of saying "I got a response
for request Y when I expected request X."

my money is on using a single connection from multiple threads. don't do that.
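
A toy model of that check, assuming each request carries a sequence id and the reply echoes it back. This is not Thrift's code; SeqIdDemo, Connection and the method names are invented for illustration. Two callers share one connection, and the second caller reads first, so it picks up the reply meant for the first.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class SeqIdDemo {
    // Toy stand-in for a single connection: replies queue up in arrival
    // order, each tagged with the sequence id of the request it answers.
    static class Connection {
        private final Deque<Integer> wire = new ArrayDeque<>();
        private int nextSeqId = 0;

        // "Send" a request and return the sequence id assigned to it.
        int send() { return ++nextSeqId; }

        // The server answers the request with the given sequence id.
        void serverReplies(int seqId) { wire.addLast(seqId); }

        // Read the next reply and check that it answers the expected request.
        void receive(int expectedSeqId) {
            Integer got = wire.pollFirst();
            if (got == null) {
                throw new IllegalStateException("no response available");
            }
            if (got != expectedSeqId) {
                throw new IllegalStateException("out of sequence response: expected "
                        + expectedSeqId + ", got " + got);
            }
        }
    }

    static String run() {
        Connection c = new Connection();
        int x = c.send();   // caller A sends request X
        int y = c.send();   // caller B sends request Y on the same connection
        c.serverReplies(x); // the server answers in order: X first,
        c.serverReplies(y); // then Y
        try {
            c.receive(y);   // caller B reads first and finds the reply to X
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With one connection per caller, each reply can only be read by the caller that sent the matching request, so the ids always line up.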

I'm not using thrift directly, and my application is single-threaded, so I
guess this is Pelops' fault somehow. Since I managed to tame memory
consumption the problem has not appeared again, but it always happened
during a stop-the-world GC. Could it be that the message was sent
instead of being dropped by the server when the client assumed it had
timed out?