incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Why the StageManager thread pools have 60 seconds keepalive time?
Date Wed, 22 Aug 2012 04:49:03 GMT
> One thing we did change in the past weeks was memtable_flush_queue_size, in order to occupy less heap space with memtables; this was due to having received this warning message and some OOM exceptions:
Danger. 

> Do you know any strategy to diagnose whether memtables flushing to disk and blocking on the switchLock are the main cause of the dropped messages? I've gone through the source code but haven't seen any metrics reporting on maybeSwitchMemtable blocking times.
As a matter of fact I do :)

It was the first thing in my Cassandra SF talk:
http://www.slideshare.net/aaronmorton/cassandra-sf-2012-technical-deep-dive-query-performance/6
http://www.datastax.com/events/cassandrasummit2012/presentations

If you reduce memtable_flush_queue_size too far, writes will block. When this happens you will see the MeteredFlusher say it wants to flush X CFs, but you will only see a few messages that say "Enqueuing flush of …"

In a "FlushWriter-*" thread you will see the Memtable log "Writing …" when it starts flushing and "Completed flushing …" when done. If the MeteredFlusher was blocked, it will immediately log "Enqueuing flush of …" when the Memtable starts writing the next SSTable.

Hope that helps. 

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/08/2012, at 6:38 AM, Guillermo Winkler <gwinkler@inconcertcc.com> wrote:

> Aaron, thanks for your answer. 
> 
> We do have big batch updates, not always with the columns belonging to the same row (i.e. many threads are needed to handle the updates), but this did not represent a problem when the CFs had less data in them.
> 
> One thing we did change in the past weeks was memtable_flush_queue_size, in order to occupy less heap space with memtables; this was due to having received this warning message and some OOM exceptions:
> 
>             logger.warn(String.format("Reducing %s capacity from %d to %s to reduce memory pressure",
>                                       cacheType, getCapacity(), newCapacity));
> 
> 
> 
> Do you know any strategy to diagnose whether memtables flushing to disk and blocking on the switchLock are the main cause of the dropped messages? I've gone through the source code but haven't seen any metrics reporting on maybeSwitchMemtable blocking times.
> 
> Thanks again,
> Guille
> 
> On Sun, Aug 19, 2012 at 5:21 AM, aaron morton <aaron@thelastpickle.com> wrote:
> You're seeing dropped mutations reported from nodetool tpstats?
> 
> Take a look at the logs. Look for messages from the MessagingService with the pattern "{} {} messages dropped in last {}ms". They will be followed by info about the TP stats.
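That dropped-message pattern can be tallied out of the logs mechanically. A rough sketch assuming the exact line shape quoted above; the class name and sample lines are illustrative:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Tallies dropped-message counts per verb from log lines matching the
 *  "{} {} messages dropped in last {}ms" pattern quoted above. */
public class DroppedScan {
    static final Pattern DROPPED =
            Pattern.compile("(\\d+) (\\w+) messages dropped in last (\\d+)ms");

    public static Map<String, Long> tally(Iterable<String> lines) {
        Map<String, Long> byVerb = new HashMap<>();
        for (String line : lines) {
            Matcher m = DROPPED.matcher(line);
            if (m.find()) {
                // group(2) is the verb (e.g. MUTATION), group(1) the count
                byVerb.merge(m.group(2), Long.parseLong(m.group(1)), Long::sum);
            }
        }
        return byVerb;
    }

    public static void main(String[] args) {
        Map<String, Long> t = tally(java.util.List.of(
                "WARN ... 1234 MUTATION messages dropped in last 5000ms",
                "WARN ... 56 READ messages dropped in last 5000ms"));
        System.out.println(t);  // counts per verb
    }
}
```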
> 
> First would be the workload. Are you sending very big batch_mutate or multiget requests?
Each row in the requests turns into a command in the appropriate thread pool. This can result
in other requests waiting a long time for their commands to get processed. 
> 
> Next would be looking for GC and checking the memtable_flush_queue_size is set high enough
(check yaml for docs). 
> 
> After that I would look at winding concurrent_writers (and I assume concurrent_readers) back. Any time I see weirdness I look for config changes and see what happens when they are returned to the default or near-default. Do you have 16 _physical_ cores?
> 
> Hope that helps. 
>   
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 18/08/2012, at 10:01 AM, Guillermo Winkler <gwinkler@inconcertcc.com> wrote:
> 
>> Aaron, thanks for your answer.
>> 
>> I'm actually tracking a problem where mutations get dropped and cfstats shows no activity whatsoever: I have 100 threads for the mutation pool and no running or pending tasks, but some mutations get dropped nonetheless.
>> 
>> I'm thinking about some scheduling problems but not really sure yet.
>> 
>> Have you ever seen a case of dropped mutations with the system under light load?
>> 
>> Thanks,
>> Guille
>> 
>> 
>> On Thu, Aug 16, 2012 at 8:22 PM, aaron morton <aaron@thelastpickle.com> wrote:
>> That's some pretty old code. I would guess it was done that way to conserve resources. And _I think_ thread creation is pretty lightweight.
>> 
>> Jonathan / Brandon / others - opinions ? 
>> 
>> Cheers
>> 
>> 
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 17/08/2012, at 8:09 AM, Guillermo Winkler <gwinkler@inconcertcc.com> wrote:
>> 
>>> Hi, I have a Cassandra cluster where I'm seeing a lot of thread thrashing from the mutation pool.
>>> 
>>> MutationStage:72031
>>> 
>>> Threads get created and disposed of in batches of 100 every few minutes; since it's a 16-core server, concurrent_writes is set to 100 in cassandra.yaml.
>>> 
>>> concurrent_writes: 100
>>> 
>>> I've seen in the StageManager class that these pools get created with a 60-second keepalive time.
>>> 
>>> DebuggableThreadPoolExecutor -> allowCoreThreadTimeOut(true);
>>> 
>>> StageManager -> public static final long KEEPALIVE = 60; // seconds to keep "extra" threads alive for when idle
>>> 
>>> Is there a reason for it to be this way?
>>> 
>>> Why not have a fixed-size pool with Integer.MAX_VALUE as the keepalive, since corePoolSize and maxPoolSize are set to the same size?
>>> 
>>> Thanks,
>>> Guille
>>> 
>> 
>> 
> 
> 

