Hi,

We are running Cassandra 1.1.7.

Our use case: we have an index column family that records how many resources are stored for each user. The number per user varies from tens to millions.

We provide a feature that lets a user delete resources by prefix.
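
To make the prefix delete concrete, here is a simplified sketch of the idea. It is not our production code: the class and method names are hypothetical, and a sorted in-memory map stands in for one user's row in the index column family, assuming one column per resource name. In Cassandra, each removed column would become a column tombstone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration only; a TreeMap plays the role of a wide row
// whose column names (resource names) are kept in sorted order.
class PrefixDeleteSketch {

    // Removes every column whose name starts with the given prefix and
    // returns the removed names in sorted order.
    static List<String> deleteByPrefix(TreeMap<String, String> row, String prefix) {
        List<String> deleted = new ArrayList<>();
        // Start at the first column name >= prefix. Names sharing the prefix
        // are contiguous in sorted order, so stop at the first non-match.
        for (String name : row.tailMap(prefix, true).keySet()) {
            if (!name.startsWith(prefix)) {
                break;
            }
            deleted.add(name);
        }
        // Delete after the scan to avoid modifying the map while iterating.
        // Assumption: each of these would be one column delete in Cassandra.
        for (String name : deleted) {
            row.remove(name);
        }
        return deleted;
    }

    public static void main(String[] args) {
        TreeMap<String, String> row = new TreeMap<>();
        row.put("docs/a", "");
        row.put("photos/1", "");
        row.put("photos/2", "");
        row.put("video/x", "");

        List<String> deleted = deleteByPrefix(row, "photos/");
        System.out.println("deleted:   " + deleted);
        System.out.println("remaining: " + row.keySet());
    }
}
```

For a user with millions of resources under one prefix, a single request like this translates into millions of column deletes in the same row.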


We found that some Cassandra nodes run out of memory (OOM) after some period. The cluster is a cross-datacenter ring.

1. Exceptions in the Cassandra log:

ERROR [Thread-5810] 2013-02-04 05:38:13,882 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-5810,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.ensureQueuedTaskHandled(ThreadPoolExecutor.java:758)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:655)
at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)
ERROR [Thread-5819] 2013-02-04 05:38:13,888 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-5819,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)
ERROR [Thread-36] 2013-02-04 05:38:13,898 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-36,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)
ERROR [Thread-3990] 2013-02-04 05:38:13,902 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[Thread-3990,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:581)
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:155)
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:113)
ERROR [ACCEPT-/10.139.50.62] AbstractCassandraDaemon.java (line 135) Exception in thread Thread[ACCEPT-/10.139.50.62,5,main]
java.lang.RuntimeException: java.nio.channels.ClosedChannelException
at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:710)
Caused by: java.nio.channels.ClosedChannelException
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:137)
at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:699)
 INFO [HintedHandoff:1] 2013-02-04 05:38:24,971 HintedHandOffManager.java (line 374) Timed out replaying hints to /23.20.84.240; aborting further deliveries
 INFO [HintedHandoff:1] 2013-02-04 05:38:24,971 HintedHandOffManager.java (line 392) Finished hinted handoff of 0 rows to endpoint
 INFO [HintedHandoff:1] 2013-02-04 05:38:24,971 HintedHandOffManager.java (line 296) Started hinted handoff for token: 3

2. In the heap dump, we found many DeletedColumn objects, rooted in ReadStage threads.


Please help: where might the problem be?

Best Regards!

Jian Jin