incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Carl Bruecken <carl.bruec...@corp.aol.com>
Subject Re: TTransportException intermittently in 0.7
Date Tue, 31 Aug 2010 22:54:52 GMT
  I've made some progress on narrowing this down and am able to 
reproduce easily.   I am using pelops as a client and I configured the 
policy in pelops to only establish 1 connection to a cassandra node.  
I'm able to step through the pelops code line by line and see the 
resulting thrift transport logging in cassandra.   Seems that flushing 
the transport causes the unwanted TTransportConnection in the server and 
subsequent closing of the connection.   The connection should stay open 
after flushing.   When there are many connection established the 
behaviour seems intermittent and many operations succeed.



Here are the details

1) The trigger from the client side is when the framed transport is flushed.
                conn.getAPI().batch_mutate(convertedBatch, cLevel);
                 // Flush connection
                 conn.flush();

2) In CustomTThreadPoolServer.java in Cassandra I modified the code to 
log TTransportExceptions.

         catch (TTransportException ttx) {
         LOGGER.error("Transport exception", ttx);
     } catch (TException tx) {
         LOGGER.error("Thrift error occurred during processing of 
message.", tx);
     } catch (Exception x) {
         LOGGER.error("Error occurred during processing of message.", x);
     }


3) Here is the exception that is ignored in cassandra.   Flushing the 
transport causes the server to believe the client has closed the connection.

org.apache.thrift.transport.TTransportException: Cannot read. Remote 
side has closed. Tried to read 4 bytes, but only got 0 bytes.
     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
     at 
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:369)
     at 
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:295)
     at 
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:202)
     at 
org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2487)
     at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
     at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:637)

4) The next batch mutate to this connection caused the exception in the 
client

  WARN [main] 2010-08-31 18:40:06,749 Operand.java (line 72) Operation 
failed as result of network exception. Connection must be destroyed.  
See cause for details...
org.apache.thrift.transport.TTransportException: 
java.net.SocketException: Connection reset
     at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
     at 
org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
     at 
org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
     at 
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:369)
     at 
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:295)
     at 
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:202)
     at 
org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:905)
     at 
org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:889)
     at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:42)
     at org.scale7.cassandra.pelops.Mutator$1.execute(Mutator.java:38)
     at org.scale7.cassandra.pelops.Operand.tryOperation(Operand.java:53)
     at org.scale7.cassandra.pelops.Mutator.execute(Mutator.java:49)
     at com.aol.data.c7.App.doWork(App.java:41)
     at com.aol.data.c7.App.main(App.java:77)
Caused by: java.net.SocketException: Connection reset
     at java.net.SocketInputStream.read(SocketInputStream.java:168)
     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
     at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
     at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
     at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
     ... 15 more




On 8/31/10 4:04 PM, Jonathan Ellis wrote:
>
> No, I don't know that anyone has reproduced that.  TTransportException
> always means "something went wrong on the thrift side" in my
> experience, it shouldn't be cassandra-version specific.
>
> On Tue, Aug 31, 2010 at 12:53 PM, Carl Bruecken
> <carl.bruecken@corp.aol.com> wrote:
> >
> >  Are there any estimates as to when a fix for this will be checked into
> > trunk?
> >
> > Coincidentally, has anyone tracked down the issue?
> >
> >  I'm experiencing same issue with nightly build from a week ago.
> >
> > Thank You
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>


Mime
View raw message