cassandra-user mailing list archives

From Chris Lohfink <clohf...@apple.com>
Subject Re: Understanding Blocked and All Time Blocked columns in tpstats
Date Fri, 23 Mar 2018 17:29:49 GMT
Increasing the queue size would increase the number of requests that can wait. It could make
GCs worse if the requests are large (big INSERTs, for example), but for a lot of very small
queries it helps to increase the queue size (up to a point). You might want to look into which
queries are being made and how, since there are possibly options to help with that (e.g.
prepared queries, changing what the queries are, limiting the number of async in-flight queries).
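
As a rough illustration of limiting async in-flight queries on the client side, here is a
minimal sketch using only the JDK. The class name InflightLimiter and its methods are
hypothetical, not part of any Cassandra driver; the Supplier stands in for whatever async
call the client makes, assuming it returns a CompletableFuture.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical helper: caps the number of async requests in flight at once.
final class InflightLimiter {
    private final Semaphore permits;

    InflightLimiter(int maxInflight) {
        this.permits = new Semaphore(maxInflight);
    }

    // Blocks the caller until a permit is free, then starts the async query.
    // The permit is returned when the query's future completes, whether it
    // succeeded or failed.
    <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> query) {
        permits.acquireUninterruptibly();
        try {
            return query.get().whenComplete((result, error) -> permits.release());
        } catch (RuntimeException e) {
            // The query could not even be started, so give the permit back.
            permits.release();
            throw e;
        }
    }

    // Number of additional requests that could start right now.
    int available() {
        return permits.availablePermits();
    }
}
```

With something like this in front of the client, a burst of work waits in the application
instead of piling up in the node's native transport queue.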

Chris

> On Mar 23, 2018, at 11:42 AM, John Sanda <john.sanda@gmail.com> wrote:
> 
> Thanks for the explanation. In the past when I have run into problems related to
> CASSANDRA-11363, I have increased the queue size via the
> cassandra.max_queued_native_transport_requests system property. If I find that the queue is
> frequently at capacity, would that be an indicator that the node is having trouble keeping
> up with the load? And if so, will increasing the queue size just exacerbate the problem?
> 
> On Fri, Mar 23, 2018 at 11:51 AM, Chris Lohfink <clohfink@apple.com> wrote:
> It blocks the caller attempting to add the task until there's room in the queue, applying
> back pressure. It does not reject it. It mimics the behavior of the pre-SEP
> DebuggableThreadPoolExecutor's RejectedExecutionHandler that the other thread pools use
> (except for sampling/tracing, which just throw tasks away on rejection).
> 
> Worth noting that, last I checked, this is only really possible in the native transport
> pool (a SEP pool). That has been the case since 2.1 at least; before that there were a few
> others, and it changes from version to version. For (basically) all other thread pools the
> queue is limited only by memory.
> 
> Chris
> 
> 
>> On Mar 22, 2018, at 10:44 PM, John Sanda <john.sanda@gmail.com> wrote:
>> 
>> I have been doing some work on a cluster that is impacted by
>> https://issues.apache.org/jira/browse/CASSANDRA-11363. Reading through the ticket prompted
>> me to take a closer look at org.apache.cassandra.concurrent.SEPExecutor. I am looking at
>> the 3.0.14 code. I am a little confused about the Blocked and All Time Blocked columns
>> reported in nodetool tpstats and reported by StatusLogger. I understand that there is a
>> queue for tasks. In the case of RequestThreadPoolExecutor, the size of that queue can be
>> controlled via the cassandra.max_queued_native_transport_requests system property.
>> 
>> I have been looking at SEPExecutor.addTask(FutureTask<?> task), and here is my question.
>> If the queue is full, as defined by SEPExecutor.maxTasksQueued, are tasks rejected? I do
>> not fully grok the code, but it looks like it is possible for tasks to be rejected here
>> (some code and comments omitted for brevity):
>> 
>> public void addTask(FutureTask<?> task)
>> {
>>     tasks.add(task);
>>     ...
>>     else if (taskPermits >= maxTasksQueued) 
>>     {
>>         WaitQueue.Signal s = hasRoom.register();
>>         
>>         if (taskPermits(permits.get()) > maxTasksQueued)
>>         {
>>             if (takeWorkPermit(true))
>>                 pool.schedule(new Work(this));
>> 
>>             metrics.totalBlocked.inc();
>>             metrics.currentBlocked.inc();
>>             s.awaitUninterruptibly();
>>             metrics.currentBlocked.dec();
>>         }
>>         else
>>             s.cancel();
>>     }   
>> }
>> 
>> The first thing that happens is that the task is added to the tasks queue. pool.schedule()
>> only gets called if takeWorkPermit() returns true. I am still studying the code, but can
>> someone explain what exactly happens when the queue is full?
>> 
>> 
>> - John
> 
> 
> 
> 
> -- 
> 
> - John
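
For readers who want to see the "block the caller instead of rejecting" behaviour from the
quoted reply in isolation, here is a minimal JDK-only sketch. It is not Cassandra's code:
BlockingSubmitPool and its fields are hypothetical names, and unlike the quoted
SEPExecutor.addTask (which adds the task first and then waits), this version makes the
caller wait before the task enters the queue. The two counters loosely mirror the Blocked
and All Time Blocked columns.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration: a bounded pool that applies back pressure by
// blocking the submitting thread when the queue is full.
final class BlockingSubmitPool {
    final AtomicLong totalBlocked = new AtomicLong();   // ever blocked ("All Time Blocked")
    final AtomicLong currentBlocked = new AtomicLong(); // blocked right now ("Blocked")
    final ThreadPoolExecutor executor;

    BlockingSubmitPool(int threads, int maxQueued) {
        RejectedExecutionHandler blockCaller = (task, pool) -> {
            if (pool.isShutdown())
                throw new RejectedExecutionException("pool is shut down");
            totalBlocked.incrementAndGet();
            currentBlocked.incrementAndGet();
            try {
                // Wait for room in the queue rather than dropping the task.
                pool.getQueue().put(task);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException("interrupted while waiting for room", e);
            } finally {
                currentBlocked.decrementAndGet();
            }
        };
        executor = new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                                          new ArrayBlockingQueue<>(maxQueued), blockCaller);
    }
}
```

In this sketch a frequently non-zero currentBlocked means submitters are regularly waiting
on the pool, which is the same signal a frequently full queue gives.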

