Subject: Re: Questions around the heap
From: aaron morton
Date: Tue, 13 Nov 2012 09:58:04 +1300
To: user@cassandra.apache.org

For background, this thread discusses the working set for Cassandra:
http://www.mail-archive.com/user@cassandra.apache.org/msg25762.html

tl;dr: you can work it out, or guess based on the tenured usage after CMS.

> How can we know how the heap is being used, monitor it ?
My favourite is to turn on the GC logging in cassandra-env.sh.
I can also recommend the GC coverage in this book: http://amzn.com/0137142528
You can also use JConsole or anything else that reads the JVM metrics via JMX.

> Why have I that much memory used in the heap of my new servers ?
IMHO the m1.xlarge is the best EC2 node (apart from ssd) to use.

> I configured a 4G heap with a 200M "new size".
That is a *very* low new heap size. I would expect it to result in frequent premature promotion into the tenured heap, which will make it look like you are using more memory.

> That is the heap that was supposed to be used.
>
> Memtable : 1.4G (1/3 of the heap)
> Key cache : 0.1G (min(5% of Heap (in MB), 100MB))
> System : 1G (more or less, from datastax doc)
>
> So we are around 2.5G max in theory out of 3G usable (threshold 0.75 of the heap before flushing memtable because of pressure)
The memtable usage is the maximum value, reached only if all the memtables are full and the flush queue is full. It's not the working size used for memtables.
The code tries to avoid ever hitting the maximum.

Not sure if the 1G for "system" is still current or what it's actually referring to.

I suggest:
* returning the configuration to the defaults.
* if you have a high number of rows, looking at the working set calculations linked above.
* monitoring the servers to look for triggers for the GC activity, such as compaction or repair.
* looking at your code base for read queries that read a lot of data. It may be writes, but it's often reads.
* if you are using the default compaction strategy, looking at the data model for rows that have a high number of deletes and/or overwrites over a long time. These can have a high tombstone count.

GC activity is relative to the workload. Try to find things that cause a lot of columns to be read from disk.

I've found the following JVM tweaks sometimes helpful:

MAX_HEAP_SIZE="4G"
HEAP_NEWSIZE="1200M"
SurvivorRatio=4
MaxTenuringThreshold=4

Hope that helps.

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 12/11/2012, at 10:26 PM, Alain RODRIGUEZ wrote:

> It's been Does anybody has an answer to any of these questions ?
>
> Alain
>
>
> 2012/11/7 Hiller, Dean
> +1, I am interested in this answer as well.
>
> From: Alain RODRIGUEZ
> Reply-To: "user@cassandra.apache.org"
> Date: Wednesday, November 7, 2012 9:45 AM
> To: "user@cassandra.apache.org"
> Subject: Re: Questions around the heap
>
> s application that heavily scans a particular column family, you would want to inhibit or disable the Bloom filter on the column family by setting it high"
>
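For reference, here is a rough sketch of how the JVM tweaks and the GC logging mentioned in the reply above might be written in cassandra-env.sh. The four tuning values are the ones given in the reply; the particular set of GC logging flags and the log file path are assumptions chosen for illustration, so check them against the commented-out GC logging section of your own cassandra-env.sh before using them.

```shell
# Sketch only: in the real cassandra-env.sh, JVM_OPTS is already populated
# by earlier lines. It is initialised here so the fragment stands alone.
JVM_OPTS="${JVM_OPTS:-}"

# Heap sizing from the reply: 4G total heap, 1200M young generation.
MAX_HEAP_SIZE="4G"
HEAP_NEWSIZE="1200M"

# Survivor space tuning from the reply. These are standard HotSpot flags.
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=4"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=4"

# GC logging (assumed selection of flags; the log path is hypothetical).
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
```

With logging like this enabled, the tenured usage reported after each CMS cycle gives a starting point for estimating the working set, and the tenuring distribution output shows whether objects are being promoted prematurely.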