Subject: Re: Why the StageManager thread pools have 60 seconds keepalive time?
From: aaron morton
To: user@cassandra.apache.org
Date: Wed, 22 Aug 2012 16:49:03 +1200

> One thing we did change in the past weeks was the memtable_flush_queue_size, in order to occupy less heap space with memtables; this was due to having received this warning message and some OOM exceptions:

Danger.

> Do you know any strategy to diagnose whether memtables flushing to disk and locking on the switchLock is the main cause of the dropped messages? I've gone through the source code but haven't seen any metrics reporting on maybeSwitchMemtable blocking times.

As a matter of fact I do :) It was the first thing in my Cassandra SF talk:

http://www.slideshare.net/aaronmorton/cassandra-sf-2012-technical-deep-dive-query-performance/6
http://www.datastax.com/events/cassandrasummit2012/presentations

If you reduce memtable_flush_queue_size too far, writes will block. When this happens you will see the MeteredFlusher say it wants to flush X CFs, but you will only see a few messages that say "Enqueuing flush of …".

In a "FlushWriter-*" thread you will see the Memtable log "Writing …" when it starts flushing and "Completed flushing …" when done. If the MeteredFlusher is blocked, it will log "Enqueuing flush of …" immediately when the Memtable starts writing the next SSTable.
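As a rough illustration of that diagnosis (this sketch is not part of the original mail, and the exact log wording and timestamp layout are assumptions that vary between Cassandra versions), something like the following could pair each "Enqueuing flush of Memtable-…" line in system.log with the later "Writing Memtable-…" line and print how long the flush sat in the queue:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Sketch: measure how long each memtable waits between being enqueued for
    // flush and a FlushWriter actually starting to write it. Assumes log lines
    // carry a "yyyy-MM-dd HH:mm:ss,SSS" timestamp and 1.1-era messages of the form
    // "Enqueuing flush of Memtable-<cf>@<id>(...)" and "Writing Memtable-<cf>@<id>(...)".
    public class FlushQueueLag
    {
        private static final Pattern TS = Pattern.compile("(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})");
        private static final SimpleDateFormat FMT = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS");

        public static void main(String[] args) throws IOException, ParseException
        {
            Map<String, Date> enqueued = new HashMap<String, Date>();
            BufferedReader in = new BufferedReader(new FileReader(args[0])); // e.g. /var/log/cassandra/system.log
            String line;
            while ((line = in.readLine()) != null)
            {
                Matcher ts = TS.matcher(line);
                if (!ts.find())
                    continue;
                Date when = FMT.parse(ts.group(1));
                if (line.contains("Enqueuing flush of Memtable-"))
                    enqueued.put(memtableId(line), when);
                else if (line.contains("Writing Memtable-"))
                {
                    Date start = enqueued.remove(memtableId(line));
                    if (start != null)
                        System.out.println(memtableId(line) + " waited " + (when.getTime() - start.getTime()) + " ms in the flush queue");
                }
            }
            in.close();
        }

        // Crude extraction of the "Memtable-<cf>@<id>" token used to pair the two log lines.
        private static String memtableId(String line)
        {
            int i = line.indexOf("Memtable-");
            int j = line.indexOf('(', i);
            return j > i ? line.substring(i, j) : line.substring(i);
        }
    }

If the reported wait times grow while the "FlushWriter-*" threads are continuously busy, that is consistent with the flush queue (and the switchLock behind it) being what stalls writes, rather than the flush writers themselves.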
Hope that helps.

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/08/2012, at 6:38 AM, Guillermo Winkler wrote:

> Aaron, thanks for your answer.
>
> We do have big batch updates, not always with the columns belonging to the same row (i.e. many threads are needed to handle the updates), but it did not represent a problem when the CFs had less data on them.
>
> One thing we did change in the past weeks was the memtable_flush_queue_size, in order to occupy less heap space with memtables; this was due to having received this warning message and some OOM exceptions:
>
> logger.warn(String.format("Reducing %s capacity from %d to %s to reduce memory pressure",
>             cacheType, getCapacity(), newCapacity));
>
> Do you know any strategy to diagnose whether memtables flushing to disk and locking on the switchLock is the main cause of the dropped messages? I've gone through the source code but haven't seen any metrics reporting on maybeSwitchMemtable blocking times.
>
> Thanks again,
> Guille
>
> On Sun, Aug 19, 2012 at 5:21 AM, aaron morton wrote:
> You're seeing dropped mutations reported from nodetool tpstats?
>
> Take a look at the logs. Look for messages from the MessagingService with the pattern "{} {} messages dropped in last {}ms". They will be followed by info about the TP stats.
>
> First would be the workload. Are you sending very big batch_mutate or multiget requests? Each row in the request turns into a command in the appropriate thread pool. This can result in other requests waiting a long time for their commands to get processed.
>
> Next would be looking for GC and checking that memtable_flush_queue_size is set high enough (check the yaml for docs).
>
> After that I would look at winding concurrent_writes (and I assume concurrent_reads) back. Any time I see weirdness I look for config changes and see what happens when they are returned to the default or near default. Do you have 16 _physical_ cores?
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 18/08/2012, at 10:01 AM, Guillermo Winkler wrote:
>
>> Aaron, thanks for your answer.
>>
>> I'm actually tracking a problem where mutations get dropped and cfstats shows no activity whatsoever. I have 100 threads for the mutation pool, no running or pending tasks, but some mutations get dropped nonetheless.
>>
>> I'm thinking it might be a scheduling problem, but I'm not really sure yet.
>>
>> Have you ever seen a case of dropped mutations with the system under light load?
>>
>> Thanks,
>> Guille
>>
>> On Thu, Aug 16, 2012 at 8:22 PM, aaron morton wrote:
>> That's some pretty old code. I would guess it was done that way to conserve resources. And _I think_ thread creation is pretty lightweight.
>>
>> Jonathan / Brandon / others - opinions?
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 17/08/2012, at 8:09 AM, Guillermo Winkler wrote:
>>
>>> Hi, I have a Cassandra cluster where I'm seeing a lot of thread thrashing in the mutation pool.
>>>
>>> MutationStage:72031
>>>
>>> Threads get created and disposed in batches of 100 every few minutes; since it's a 16-core server, concurrent_writes is set to 100 in cassandra.yaml:
>>>
>>> concurrent_writes: 100
>>>
>>> I've seen in the StageManager class that these pools get created with a 60-second keepalive time:
>>>
>>> DebuggableThreadPoolExecutor -> allowCoreThreadTimeOut(true);
>>>
>>> StageManager -> public static final long KEEPALIVE = 60; // seconds to keep "extra" threads alive for when idle
>>>
>>> Is there a reason for it to be this way?
>>>
>>> Why not have a fixed-size pool with Integer.MAX_VALUE as the keepalive, since corePoolSize and maxPoolSize are set to the same size?
>>>
>>> Thanks,
>>> Guille
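For anyone reading this in the archive, here is a minimal, self-contained sketch (plain java.util.concurrent, not Cassandra's actual DebuggableThreadPoolExecutor) of the two configurations being compared: a pool with corePoolSize == maximumPoolSize, a 60-second keepalive and allowCoreThreadTimeOut(true), versus the fixed pool with an effectively infinite keepalive that the question suggests. The pool size of 100 mirrors the concurrent_writes value above; everything else is illustrative.

    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class KeepAliveDemo
    {
        public static void main(String[] args) throws InterruptedException
        {
            // Roughly the StageManager-style setup: core == max == 100,
            // 60s keepalive, and core threads allowed to time out when idle.
            ThreadPoolExecutor stageStyle = new ThreadPoolExecutor(
                    100, 100, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
            stageStyle.allowCoreThreadTimeOut(true);

            // The alternative from the question: same size, but the threads never time out.
            ThreadPoolExecutor fixed = new ThreadPoolExecutor(
                    100, 100, Integer.MAX_VALUE, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());

            burst(stageStyle);
            burst(fixed);

            // After more than 60 idle seconds the stage-style pool has torn down all of
            // its threads (so the next burst pays thread-creation cost again), while the
            // fixed pool still holds 100 idle threads and their stacks.
            TimeUnit.SECONDS.sleep(70);
            System.out.println("stage-style pool size after idle: " + stageStyle.getPoolSize()); // expect 0
            System.out.println("fixed pool size after idle:       " + fixed.getPoolSize());      // expect 100

            stageStyle.shutdown();
            fixed.shutdown();
        }

        // Submit a burst of cheap tasks and wait for the pool to drain.
        private static void burst(ThreadPoolExecutor pool) throws InterruptedException
        {
            for (int i = 0; i < 1000; i++)
                pool.execute(new Runnable() { public void run() { /* simulate a mutation */ } });
            while (pool.getActiveCount() > 0 || !pool.getQueue().isEmpty())
                TimeUnit.MILLISECONDS.sleep(10);
        }
    }

The trade-off matches Aaron's guess above: with allowCoreThreadTimeOut(true) an idle stage gives its thread stacks back after a minute, at the cost of recreating threads when the next burst arrives, which is consistent with the MutationStage thread numbers climbing under bursty load.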