cassandra-commits mailing list archives

From "Mike Malone (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-1358) Clogged RRS/RMS stages can hold up processing of gossip messages and request acks
Date Wed, 04 Aug 2010 21:47:17 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12895434#action_12895434 ]

Mike Malone commented on CASSANDRA-1358:
----------------------------------------

I'm continuing to dig deeper into this code while simultaneously nursing one of our clusters
back to health, so I apologize for the sort of stream-of-consciousness here...

I noticed that several of the executor queues are bounded at 4096 tasks. Has there been much
thought put into that choice, or is it an arbitrary round number that someone picked? It seems
to me that bumping that number up a couple of orders of magnitude, or making the queues unbounded,
might ameliorate the situation. Instead of the stage executors filling up and pushing task
execution back onto the calling thread (which is single-threaded in the case of MDP), more
messages would stack up in the callee queues. That should give the various stages a fair chance
of processing the stuff they're interested in without being blocked by MDP (which is in turn
being blocked by some other stage). There may be some slight memory overhead because deserialized
objects will be in memory instead of serialized ones, but that's a price I'd be willing to pay.
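
To make the contrast concrete, here's a minimal, self-contained sketch using plain
java.util.concurrent (the class name, queue capacity, and thread counts are illustrative,
not the values in the Cassandra source):

    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class BoundedVsUnboundedQueue
    {
        public static void main(String[] args) throws Exception
        {
            // Bounded queue (2 here instead of 4096 so the effect shows up immediately):
            // once the queue is full, CallerRunsPolicy pushes execution back onto the
            // submitting thread, which is what stalls a single-threaded caller like the
            // MESSAGE-DESERIALIZER-POOL.
            ThreadPoolExecutor bounded = new ThreadPoolExecutor(
                1, 1, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>(2),
                new ThreadPoolExecutor.CallerRunsPolicy());

            for (int i = 0; i < 10; i++)
            {
                bounded.execute(new Runnable()
                {
                    public void run()
                    {
                        try { Thread.sleep(100); } catch (InterruptedException ignored) {}
                        System.out.println("ran on " + Thread.currentThread().getName());
                    }
                });
            }
            // Some of the lines printed above name the "main" thread: the caller was
            // drafted to run tasks because the queue was full.

            // Unbounded queue: execute() never rejects, so the backlog just grows in the
            // callee's queue, at the cost of holding more deserialized objects in memory.
            ThreadPoolExecutor unbounded = new ThreadPoolExecutor(
                1, 1, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());

            bounded.shutdown();
            unbounded.shutdown();
            bounded.awaitTermination(10, TimeUnit.SECONDS);
        }
    }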

I did find one possible reason to have an executor with a core pool size of 1, an unbounded
queue, and a maximumPoolSize > 1: it looks like the default RejectedExecutionHandler is
affected by maximumPoolSize. If it's > 1, the default constructor assumes that tasks can
safely be scheduled in parallel and falls back to a "caller runs" policy when the queue is
full. But if the maximumPoolSize is 1, the rejected execution handler spins, offering the
task to the queue with a one-second timeout. So if your maximum pool size is greater than
one, you can basically use the calling threads for spare capacity.
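
If I'm reading it right, the behavior amounts to something like the handler below (a sketch
of the behavior as described, not the actual Cassandra handler; the class name is made up):

    import java.util.concurrent.RejectedExecutionHandler;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    // Sketch of the rejection behavior described above, keyed off maximumPoolSize.
    public class MaxPoolSensitiveHandler implements RejectedExecutionHandler
    {
        public void rejectedExecution(Runnable task, ThreadPoolExecutor pool)
        {
            if (pool.getMaximumPoolSize() > 1)
            {
                // Parallel execution is assumed to be safe, so run on the caller:
                // the submitting thread becomes spare capacity.
                task.run();
            }
            else
            {
                // Single-threaded pool: keep re-offering the task to the queue with a
                // one-second timeout until it is accepted.
                try
                {
                    while (!pool.getQueue().offer(task, 1, TimeUnit.SECONDS))
                    {
                        // spin until the queue has room
                    }
                }
                catch (InterruptedException e)
                {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }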

Still, if that's the goal, it should be made more explicit. I'm guessing the intent was to
give the MDP one thread per core under the assumption that it will be completely CPU bound.
But the implementation is borked for a number of reasons. Plus, if the MDP can block on other
stuff, the CPU-bound assumption is wrong. If MDP can block, it should probably have a lot
more threads.

> Clogged RRS/RMS stages can hold up processing of gossip messages and request acks
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-1358
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1358
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.5
>         Environment: All.
>            Reporter: Mike Malone
>             Fix For: 0.6.5
>
>
> The message deserialization process can become a bottleneck that prevents efficient resource
> utilization because the executor that manages the deserialization process will never grow
> beyond a single thread. The message deserializer executor is instantiated in the MessagingService
> constructor as a JMXEnabledThreadPoolExecutor, which extends java.util.concurrent.ThreadPoolExecutor.
> The thread pool is instantiated with a corePoolSize of 1 and a maximumPoolSize of
> Runtime.getRuntime().availableProcessors(). But, according to the ThreadPoolExecutor
> documentation, "using an unbounded queue (for example a LinkedBlockingQueue without a
> predefined capacity) will cause new tasks to be queued in cases where all corePoolSize
> threads are busy. Thus, no more than corePoolSize threads will ever be created. (And the
> value of the maximumPoolSize therefore doesn't have any effect.)"
> The message deserializer pool uses a LinkedBlockingQueue, so there will never be more than
> one deserialization thread. This issue became a problem in our production cluster when the
> MESSAGE-DESERIALIZER-POOL began to back up on a node that was only lightly loaded. We
> increased the core pool size to 4 and the situation improved, but the deserializer pool was
> still backing up while the machine was not fully utilized (less than 100% CPU utilization).
> This leads me to think that the deserializer thread is blocking on some sort of I/O, which
> seems like it shouldn't happen.
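
For reference, the ThreadPoolExecutor behavior quoted above is easy to reproduce in isolation.
A minimal standalone sketch (the pool shape mirrors the description; the tasks are just
placeholders):

    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class CorePoolSizeDemo
    {
        public static void main(String[] args) throws Exception
        {
            // Same shape as the deserializer pool described above: corePoolSize 1,
            // maximumPoolSize = number of cores, unbounded LinkedBlockingQueue.
            int cores = Runtime.getRuntime().availableProcessors();
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, cores, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());

            final CountDownLatch release = new CountDownLatch(1);
            for (int i = 0; i < 100; i++)
            {
                pool.execute(new Runnable()
                {
                    public void run()
                    {
                        try { release.await(); } catch (InterruptedException ignored) {}
                    }
                });
            }

            // Even with 100 tasks backed up, the pool never grows past corePoolSize,
            // so maximumPoolSize has no effect here.
            System.out.println("pool size: " + pool.getPoolSize()); // prints 1

            release.countDown();
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        }
    }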

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

