I can't reproduce this after cleaning out the data/ directories (and commit logs) and restarting the cluster. I suspect the cluster had gotten into an inconsistent state while I was experimenting with the keyspace's column family definitions.

I've updated the bug report with a comment to this effect, but have left it open for someone with more experience to close it out.

Cheers,
Phil

On Aug 19, 2009, at 5:14 PM, Phillip Michalak wrote:

Sure thing. Filed as https://issues.apache.org/jira/browse/CASSANDRA-381

Thanks,
Phil

On Aug 19, 2009, at 4:54 PM, Jonathan Ellis wrote:

Looks like a bug in TcpConnectionManager.  Can you file a ticket?

thanks,

-Jonathan

On Wed, Aug 19, 2009 at 2:49 PM, Phillip Michalak <phil.michalak@digitalreasoning.com> wrote:
It's cassandra-0.4-beta1.

Thanks!
Phil

On Aug 19, 2009, at 4:43 PM, Jonathan Ellis wrote:

Is this 0.3 or 0.4/trunk?

On Wed, Aug 19, 2009 at 2:36 PM, Phillip Michalak <phil.michalak@digitalreasoning.com> wrote:

I'm running three Cassandra nodes in virtual machines.
During a 'get' operation from Cassandra-remote directed at one of these
nodes, I'm receiving the following output:

vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h 192.168.133.130:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr' "ColumnPath('strings', None, 'id')" 2
/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:58: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
 self.message = message
/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:99: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
 self.message = iprot.readString();
Traceback (most recent call last):
 File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
   pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
 File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py", line 182, in get
   return self.recv_get()
 File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py", line 201, in recv_get
   raise x
thrift.Thrift.TApplicationException/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:76: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
 if self.message:
/usr/local/lib/python2.6/dist-packages/thrift/Thrift.py:77: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6
 return self.message
: Internal error processing get

The same 'get' operation from Cassandra-remote directed at another of
these nodes yields 'normal' output:
vadmin@vadmin:~/cassandra$ interface/gen-py/cassandra/Cassandra-remote -h 192.168.133.129:9160 get 'MockElementLibrary' '0401318uuuuruepwdcznr' "ColumnPath('strings', None, 'id')" 2
Traceback (most recent call last):
 File "interface/gen-py/cassandra/Cassandra-remote", line 93, in <module>
   pp.pprint(client.get(args[0],args[1],eval(args[2]),eval(args[3]),))
 File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py", line 182, in get
   return self.recv_get()
 File "/home/vadmin/cassandra-0.4.0-beta1/interface/gen-py/cassandra/Cassandra.py", line 210, in recv_get
   raise result.nfe
ttypes.NotFoundException: NotFoundException()
Furthermore, querying the same column for (some) other keys is successful
no matter which node it is directed at.
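
One way to make this kind of per-node comparison repeatable is to run the identical read against each node and record how each one responds. The following is a minimal Python sketch, not part of Cassandra or its Thrift bindings: `probe_nodes`, `disagreeing_hosts`, and the `fetch` callable are hypothetical names, and `fetch` stands in for whatever client call performs the get.

```python
from collections import namedtuple

NodeResult = namedtuple("NodeResult", ["host", "outcome", "detail"])


def probe_nodes(hosts, fetch):
    """Run the same read against every host and record how each one responds.

    `fetch` is a hypothetical callable taking a "host:port" string; it should
    return the column value or raise whatever exception the client raises.
    """
    results = []
    for host in hosts:
        try:
            value = fetch(host)
        except Exception as exc:
            # Keep the exception type so outcomes can be compared across nodes.
            results.append(NodeResult(host, type(exc).__name__, str(exc)))
        else:
            results.append(NodeResult(host, "ok", repr(value)))
    return results


def disagreeing_hosts(results):
    """Return the hosts whose outcome differs from the most common outcome."""
    counts = {}
    for r in results:
        counts[r.outcome] = counts.get(r.outcome, 0) + 1
    majority = max(counts, key=counts.get)
    return [r.host for r in results if r.outcome != majority]
```

The majority comparison is only meaningful with three or more nodes; on a tie, the outcome seen first is treated as the majority.
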
Looking at the log for the node that produced the error from the query
above:

DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 CassandraServer.java (line 221) get
DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,618 StorageProxy.java (line 420) strongread reading data for SliceByNamesReadCommand(table='MockElementLibrary', key='0401318uuuuruepwdcznr', columnParent='QueryPath(columnFamilyName='strings', superColumnName='null', columnName='null')', columns=[id,]) from 38184@null
DEBUG [pool-1-thread-22] 2009-08-19 16:54:57,619 StorageProxy.java (line 427) strongread reading digest for SliceByNamesReadCommand(table='MockElementLibrary', key='0401318uuuuruepwdcznr', columnParent='QueryPath(columnFamilyName='strings', superColumnName='null', columnName='null')', columns=[id,]) from 38185@192.168.133.129:7000
 WARN [MESSAGE-SERIALIZER-POOL:4] 2009-08-19 16:54:57,619 MessageSerializationTask.java (line 81) Exception was generated at : 08/19/2009 16:54:57 on thread MESSAGE-SERIALIZER-POOL:4
java.lang.NullPointerException
       at org.apache.cassandra.net.TcpConnection.<init>(TcpConnection.java:83)
       at org.apache.cassandra.net.TcpConnectionManager.getConnection(TcpConnectionManager.java:64)
       at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:306)
       at org.apache.cassandra.net.MessageSerializationTask.run(MessageSerializationTask.java:66)
       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
       at java.lang.Thread.run(Thread.java:619)
DEBUG [RESPONSE-STAGE:4] 2009-08-19 16:54:57,622 ResponseVerbHandler.java (line 38) Processing response on a callback from 65B1E352-A0A3-1A7F-138B-9BEA3E1D787F@192.168.133.129:7000
ERROR [pool-1-thread-22] 2009-08-19 16:55:02,619 Cassandra.java (line 608) Internal error processing get
java.lang.RuntimeException: java.util.concurrent.TimeoutException: Operation timed out - received only 1 responses from 192.168.133.129:7000 .
       at org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:100)
       at org.apache.cassandra.service.CassandraServer.get(CassandraServer.java:226)
       at org.apache.cassandra.service.Cassandra$Processor$get.process(Cassandra.java:602)
       at org.apache.cassandra.service.Cassandra$Processor.process(Cassandra.java:560)
       at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
       at java.lang.Thread.run(Thread.java:619)
Caused by: java.util.concurrent.TimeoutException: Operation timed out - received only 1 responses from 192.168.133.129:7000 .
       at org.apache.cassandra.service.QuorumResponseHandler.get(QuorumResponseHandler.java:86)
       at org.apache.cassandra.service.StorageProxy.strongRead(StorageProxy.java:435)
       at org.apache.cassandra.service.StorageProxy.readProtocol(StorageProxy.java:330)
       at org.apache.cassandra.service.CassandraServer.readColumnFamily(CassandraServer.java:92)
       ... 7 more
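
The `@null` endpoint in the first strongread entry is easy to miss in a long log. As a rough aid, the sketch below scans log text for reads addressed to a null endpoint. The regular expression is an assumption based only on the entries quoted above (`from <identifier>@<endpoint>`), and `find_null_endpoints` is a hypothetical helper name, not a Cassandra tool.

```python
import re

# Matches "from <identifier>@null", i.e. a request addressed to an endpoint
# that was logged as null. The pattern is inferred from the log lines above.
NULL_ENDPOINT = re.compile(r"\bfrom\s+(\S+)@null\b")


def find_null_endpoints(log_text):
    """Return the identifiers of requests that were sent to a null endpoint."""
    return NULL_ENDPOINT.findall(log_text)
```

Applied to the log above, this would pick out the `38184` entry and ignore the two entries addressed to 192.168.133.129:7000.
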

It appears to me that there is a timeout during the QuorumResponseHandler
processing, stemming from a NullPointerException that occurs as part of
the read process. I suspect that this NullPointerException has something
to do with the second DEBUG [pool-1-thread-22] log entry, the one
regarding strongread ... from 38184@null.
Does anyone know why this might be happening?
Thanks for any insight,
Phil
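
The failure mode described above (one request silently lost, then the caller waiting until a timeout) can be illustrated with a toy model. This is only a sketch under stated assumptions, not Cassandra's implementation: `send_read` and `quorum_read` are hypothetical names, a `None` endpoint plays the role of the `null` address, and the required response count of 2 is assumed from the "received only 1 responses" message.

```python
import queue
import threading
import time


def send_read(endpoint, responses):
    """Hypothetical stand-in for dispatching a read to one replica."""
    def task():
        try:
            if endpoint is None:
                # Loosely analogous to the NullPointerException on the
                # serializer thread: the failure never reaches the caller.
                raise ValueError("endpoint is None")
            responses.put((endpoint, "response"))
        except ValueError:
            pass  # dropped on the worker thread in this sketch

    threading.Thread(target=task, daemon=True).start()


def quorum_read(endpoints, required, timeout_s):
    """Wait for `required` responses; raise TimeoutError if they never arrive."""
    responses = queue.Queue()
    for endpoint in endpoints:
        send_read(endpoint, responses)

    received = []
    deadline = time.monotonic() + timeout_s
    while len(received) < required:
        remaining = max(deadline - time.monotonic(), 0.0)
        try:
            received.append(responses.get(timeout=remaining))
        except queue.Empty:
            raise TimeoutError(
                "Operation timed out - received only %d responses" % len(received)
            )
    return received
```

Calling `quorum_read([None, "192.168.133.129:7000"], required=2, timeout_s=0.2)` raises `TimeoutError` after one response has arrived, because the request to the `None` endpoint was never sent. In the log above, the ERROR entry is stamped five seconds after the reads were dispatched, which is consistent with the caller waiting out a fixed timeout.
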