So we created a script that checks whether Cassandra is alive, and we run it every two minutes. Here are some of today's results:

Tue Oct 11 18:28:09 UTC 2011 - F this Cassandra bullshit... it died again
Tue Oct 11 19:00:10 UTC 2011 - F this Cassandra bullshit... it died again
Tue Oct 11 19:30:10 UTC 2011 - F this Cassandra bullshit... it died again
Tue Oct 11 20:02:10 UTC 2011 - F this Cassandra bullshit... it died again
Tue Oct 11 21:34:10 UTC 2011 - F this Cassandra bullshit... it died again
Tue Oct 11 22:06:10 UTC 2011 - F this Cassandra bullshit... it died again
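
For anyone who wants to reproduce this kind of check, here is a rough sketch of what such a liveness script could look like. It is not the script we actually run. It assumes the node listens on the default Thrift port 9160 on localhost, and the names is_alive, format_failure and check_once, as well as the neutral message text, are made up for illustration. A successful TCP connect only shows that the port is open, not that the node is healthy.

```python
import socket
from datetime import datetime, timezone

# Assumptions, not taken from the original setup:
HOST = "127.0.0.1"   # node under test
PORT = 9160          # default Thrift rpc_port for Cassandra 0.7


def is_alive(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


def format_failure(now):
    """Build one 'timestamp - message' line for the given UTC datetime."""
    stamp = now.strftime("%a %b %d %H:%M:%S UTC %Y")
    return stamp + " - Cassandra is not accepting connections"


def check_once(host=HOST, port=PORT):
    """Return a failure line if the node is unreachable, otherwise None."""
    if is_alive(host, port):
        return None
    return format_failure(datetime.now(timezone.utc))


if __name__ == "__main__":
    line = check_once()
    if line:
        print(line)
```

Scheduled from cron with a `*/2 * * * *` entry and stdout appended to a file, this gives one line per failed check, similar in shape to the results above.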


And here are some of the log tails:

 INFO [CompactionExecutor:1] 2011-10-11 18:58:14,909 CompactionManager.java (line 395) Compacting []
 INFO [FlushWriter:10] 2011-10-11 18:58:14,951 Memtable.java (line 172) Completed flushing /var/lib/cassandra/data/system/HintsColumnFamily-f-568-Data.db (60 bytes)
 INFO [FlushWriter:10] 2011-10-11 18:58:14,951 Memtable.java (line 157) Writing Memtable-HintsColumnFamily@1493400027(0 bytes, 1 operations)
 INFO [FlushWriter:10] 2011-10-11 18:58:14,991 Memtable.java (line 172) Completed flushing /var/lib/cassandra/data/system/HintsColumnFamily-f-569-Data.db (61 bytes)
 INFO [FlushWriter:10] 2011-10-11 18:58:14,991 Memtable.java (line 157) Writing Memtable-HintsColumnFamily@1932871300(0 bytes, 1 operations)
 INFO [FlushWriter:10] 2011-10-11 18:58:15,031 Memtable.java (line 172) Completed flushing /var/lib/cassandra/data/system/HintsColumnFamily-f-570-Data.db (61 bytes)

 INFO [NonPeriodicTasks:1] 2011-10-11 19:29:20,906 SSTable.java (line 147) Deleted /var/lib/cassandra/data/system/HintsColumnFamily-f-1066
 INFO [NonPeriodicTasks:1] 2011-10-11 19:29:20,906 SSTable.java (line 147) Deleted /var/lib/cassandra/data/system/HintsColumnFamily-f-1098
 INFO [NonPeriodicTasks:1] 2011-10-11 19:29:20,906 SSTable.java (line 147) Deleted /var/lib/cassandra/data/system/HintsColumnFamily-f-1040
 INFO [NonPeriodicTasks:1] 2011-10-11 19:29:20,906 SSTable.java (line 147) Deleted /var/lib/cassandra/data/system/HintsColumnFamily-f-1071
 INFO [NonPeriodicTasks:1] 2011-10-11 19:29:20,907 SSTable.java (line 147) Deleted /var/lib/cassandra/data/system/HintsColumnFamily-f-1093

 INFO [FlushWriter:8] 2011-10-11 20:00:10,701 Memtable.java (line 157) Writing Memtable-HintsColumnFamily@1488536311(0 bytes, 1 operations)
 INFO [CompactionExecutor:1] 2011-10-11 20:00:10,701 CompactionManager.java (line 395) Compacting [SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily-f-1687-Data.db'),SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily-f-1688-Data.db'),SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily-f-1689-Data.db'),SSTableReader(path='/var/lib/cassandra/data/system/HintsColumnFamily-f-1690-Data.db')]
 INFO [FlushWriter:8] 2011-10-11 20:00:10,741 Memtable.java (line 172) Completed flushing /var/lib/cassandra/data/system/HintsColumnFamily-f-1691-Data.db (61 bytes)

 INFO [NonPeriodicTasks:1] 2011-10-11 21:33:26,980 SSTable.java (line 147) Deleted /var/lib/cassandra/data/system/HintsColumnFamily-f-3349
ERROR [Thread-18] 2011-10-11 21:33:31,452 AbstractCassandraDaemon.java (line 132) Fatal exception in thread Thread[Thread-18,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
       at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:76)
       at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
       at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
       at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:385)
       at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:114)

ERROR [Thread-19] 2011-10-11 22:04:39,195 AbstractCassandraDaemon.java (line 132) Fatal exception in thread Thread[Thread-19,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
       at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:76)
       at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816)
       at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337)
       at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:385)
       at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:114)


I'm going to raise the logging level to DEBUG. Other than that, I've got to say that Cassandra 0.7.9 is F'ed in one way or another.