An example scenario (that is now fixed in Pelops):
  1. Attempt to write a column with a null value
  2. Cassandra throws a TProtocolException which renders the connection useless for future operations
  3. Pelops returns the corrupt connection to the pool
  4. A second read operation is attempted with the corrupt connection and Cassandra throws an ApplicationException
A Pelops test case for this can be found here:
https://github.com/s7/scale7-pelops/blob/3fe7584a24bb4b62b01897a814ef62415bd2fe43/src/test/java/org/scale7/cassandra/pelops/MutatorIntegrationTest.java#L262
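The failure sequence above can be sketched roughly like this (a toy model; the class and method names are illustrative, not Pelops' actual API). The key point is that once a protocol-level error occurs, the transport's framing is desynchronized, so the connection must be discarded rather than returned to the pool:

```java
// Hypothetical sketch of the bug pattern; names are illustrative, not Pelops' API.
// A connection that hits a protocol-level error must be destroyed, not reused,
// because its request/response framing is now out of sync.
public class PooledConnectionSketch {
    static class Connection {
        boolean corrupt = false;

        void write(Object value) {
            if (value == null) {
                corrupt = true; // framing is now desynchronized
                throw new RuntimeException("TProtocolException: null value");
            }
        }
    }

    // The fix: a connection that saw a protocol error is not safe to return.
    static boolean safeToReturn(Connection c) {
        return !c.corrupt;
    }

    public static void main(String[] args) {
        Connection c = new Connection();
        try {
            c.write(null); // steps 1-2: null write triggers a protocol exception
        } catch (RuntimeException e) {
            // step 3 (the old bug): returning c to the pool here poisons later reads
        }
        System.out.println(safeToReturn(c)); // prints false: must be closed instead
    }
}
```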

Cheers,
-- 
Dan Washusen

On Tuesday, 19 April 2011 at 10:28 AM, Jonathan Ellis wrote:

Any idea what's causing the original TPE?

On Mon, Apr 18, 2011 at 6:22 PM, Dan Washusen <dan@reactive.org> wrote:
It turns out that once a TProtocolException is thrown from Cassandra, the
connection is useless for future operations. Pelops was closing connections
when it detected TimedOutException, TTransportException and
UnavailableException, but not TProtocolException.  We have now changed Pelops
to close connections in all cases *except* NotFoundException.
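The new policy boils down to a simple classification (a hedged sketch with stand-in exception classes, not the real Thrift types or Pelops code): every exception is treated as connection-fatal except NotFoundException, which just means the key doesn't exist and leaves the connection healthy.

```java
// Sketch of the close-on-exception policy described above.
// The nested exception classes are stand-ins for the real Thrift types.
public class ClosePolicySketch {
    static class NotFoundException extends Exception {}
    static class TProtocolException extends Exception {}
    static class TimedOutException extends Exception {}

    // Close the connection for every exception except NotFoundException,
    // which is a normal "no such key" result, not a transport failure.
    static boolean shouldCloseConnection(Exception e) {
        return !(e instanceof NotFoundException);
    }

    public static void main(String[] args) {
        System.out.println(shouldCloseConnection(new NotFoundException()));  // false
        System.out.println(shouldCloseConnection(new TProtocolException())); // true
        System.out.println(shouldCloseConnection(new TimedOutException()));  // true
    }
}
```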

Cheers,
--
Dan Washusen

On Friday, 8 April 2011 at 7:28 AM, Dan Washusen wrote:

Pelops uses a single connection per operation from a pool that is backed by
Apache Commons Pool (assuming you're using Cassandra 0.7).  I'm not saying
it's perfect but it's NOT sharing a connection over multiple threads.
Dan Hendry mentioned that he sees these errors.  Is he also using Pelops?
From his comment about retrying, I'd assume not...
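The one-connection-per-operation pattern described above can be sketched like this (a toy stand-in for the Commons Pool-backed pool; names are illustrative). Each operation borrows a connection, uses it on the calling thread only, and returns it when done, so no connection is ever shared across concurrent operations:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal sketch of the borrow/use/return pattern; a toy stand-in for
// Apache Commons Pool, not Pelops' actual implementation.
public class PerOperationPoolSketch {
    static class Connection { int uses = 0; }

    private final Deque<Connection> idle = new ArrayDeque<>();

    synchronized Connection borrow() {
        return idle.isEmpty() ? new Connection() : idle.pop();
    }

    synchronized void release(Connection c) {
        idle.push(c);
    }

    int performOperation() {
        Connection c = borrow();   // exclusive to this operation
        try {
            return ++c.uses;       // do the work with the connection
        } finally {
            release(c);            // hand it back only once we're done
        }
    }

    public static void main(String[] args) {
        PerOperationPoolSketch pool = new PerOperationPoolSketch();
        pool.performOperation();
        int uses = pool.performOperation(); // the idle connection is reused serially
        System.out.println(uses);           // prints 2
    }
}
```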

--
Dan Washusen

On Thursday, 7 April 2011 at 7:39 PM, Héctor Izquierdo Seliva wrote:

On Wed, 06-04-2011 at 21:04 -0500, Jonathan Ellis wrote:

"out of sequence response" is thrift's way of saying "I got a response
for request Y when I expected request X."

my money is on using a single connection from multiple threads. don't do
that.

I'm not using thrift directly, and my application is single-threaded, so I
guess this is Pelops' fault somehow. Since I managed to tame memory
consumption the problem has not appeared again, but it always happened
during a stop-the-world GC. Could it be that the response was sent by the
server instead of being dropped when the client assumed the request had
timed out?
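That failure mode can be sketched as follows (a toy model, not the real Thrift transport): if the client times out during a GC pause but the server still writes a reply, that reply stays queued on the socket. The next request on the same connection then reads the stale reply, whose sequence id doesn't match, producing Thrift's "out of sequence response".

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the stale-response scenario; not the real Thrift transport.
public class OutOfSequenceSketch {
    // "bytes on the wire" from server to client, modeled as sequence ids
    private final Deque<Integer> wire = new ArrayDeque<>();
    private int nextSeqId = 0;

    int sendRequest(boolean clientTimesOut) {
        int seqId = ++nextSeqId;
        wire.add(seqId);          // the server always answers eventually
        if (clientTimesOut) {
            return -1;            // client gave up; its reply is left unread
        }
        return wire.remove();     // normally we read our own reply
    }

    public static void main(String[] args) {
        OutOfSequenceSketch conn = new OutOfSequenceSketch();
        conn.sendRequest(true);                 // request 1 times out during the GC pause
        int received = conn.sendRequest(false); // request 2 reads the stale reply
        System.out.println("got seq " + received + ", expected seq 2"); // got seq 1
    }
}
```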



--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com