I'm observing the following on a cluster that started with 4 nodes. I have been killing and restarting various nodes while testing Cassandra, and I'm now seeing a lot of NotFoundException errors in the client. I believe the cause is ring state that is out of sync between the two nodes that are still up and available. The first ring state shown below reflects the current state of the cluster. I have also seen similar issues when one node thinks another node is still available when in fact it has been killed. It seems to be related to bringing nodes up and killing them too quickly, without giving the rest of the cluster time to figure out that a node is "dead". In that case I see a TimedOutException related to the NIO SocketChannel class.

Thanks!

[cassandra.883477]$ bin/nodeprobe -host gen-app02.dev.real.com -port 8080 ring
Address       Status     Load          Range                                      Ring
                                       144038903974614862325597275257769797985   
172.27.128.186Down       22.17 MB      31124469348629903091013930339840898757     |<--|
172.27.128.23 Down       22.17 MB      64378740291415296162944450043143967518     |   |
172.27.128.22 Up         22.17 MB      121134220722269938669001112695509564769    |   |
172.27.128.185Up         14.69 MB      144038903974614862325597275257769797985    |-->|

[cassandra.883477]$ bin/nodeprobe -host vmguest85.prognet.com -port 8080 ring
Address       Status     Load          Range                                      Ring
                                       144038903974614862325597275257769797985   
172.27.128.22 Up         22.17 MB      121134220722269938669001112695509564769    |<--|
172.27.128.185Up         14.69 MB      144038903974614862325597275257769797985    |-->|
[cassandra.883477]$
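
To make the mismatch between the two views above easier to spot, here is a minimal Python sketch that parses both ring listings and reports the differences. The helper names `parse_ring` and `diff_rings` are hypothetical, not part of nodeprobe or Cassandra, and the regular expression assumes the row layout exactly as pasted above (including the address and status columns running together when the address is long).

```python
import re

# One node row of the ring listing. The address and status columns may be
# fused (e.g. "172.27.128.186Down"), so whitespace between them is optional.
# This layout is an assumption based only on the output pasted above.
ROW = re.compile(r"^(\d+\.\d+\.\d+\.\d+)\s*(Up|Down)\s+(\S+\s+\S+)\s+(\d+)")


def parse_ring(text):
    """Return {address: (status, token)} for each node row in a ring listing."""
    ring = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            addr, status, _load, token = match.groups()
            ring[addr] = (status, token)
    return ring


def diff_rings(a, b):
    """Return (only in a, only in b, present in both but different)."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_a, only_b, changed


# Ring as reported by gen-app02.dev.real.com (copied from the output above).
VIEW_GEN_APP02 = """\
172.27.128.186Down       22.17 MB      31124469348629903091013930339840898757     |<--|
172.27.128.23 Down       22.17 MB      64378740291415296162944450043143967518     |   |
172.27.128.22 Up         22.17 MB      121134220722269938669001112695509564769    |   |
172.27.128.185Up         14.69 MB      144038903974614862325597275257769797985    |-->|
"""

# Ring as reported by vmguest85.prognet.com (copied from the output above).
VIEW_VMGUEST85 = """\
172.27.128.22 Up         22.17 MB      121134220722269938669001112695509564769    |<--|
172.27.128.185Up         14.69 MB      144038903974614862325597275257769797985    |-->|
"""

if __name__ == "__main__":
    only_a, only_b, changed = diff_rings(
        parse_ring(VIEW_GEN_APP02), parse_ring(VIEW_VMGUEST85)
    )
    print("only in gen-app02 view:", only_a)
    print("only in vmguest85 view:", only_b)
    print("same node, different status/token:", changed)
```

For the two listings above, the only difference is that the second view has no entries at all for the two nodes the first view reports as Down; the two Up nodes agree on status and token in both views.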