From: Alain RODRIGUEZ
Date: Wed, 7 Nov 2012 16:45:06 +0100
Subject: Re: Questions around the heap
To: user@cassandra.apache.org

I have to say that I have no idea how to tune them.

I discovered the existence of bloom filters a few months ago, and even after reading http://wiki.apache.org/cassandra/ArchitectureOverview#line-132 and http://spyced.blogspot.com/2009/01/all-you-ever-wanted-to-know-about.html I am not sure what the impacts (positive and negative) of tuning the bloom filters would be.

From what I have read, I understand that with a bloom_filter_fp_chance > 0 I introduce a chance of getting a false positive from an SSTable, which may add latency when answering queries but uses less memory. Is that right?
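
To put rough numbers on that trade-off, here is a minimal sketch using the textbook sizing formula for an optimally configured Bloom filter, bits per key = -ln(p) / (ln 2)^2. It is an assumption that Cassandra's own filters follow this closely; the real implementation may round or cap differently, and the key count of 100 million is purely illustrative.

```python
import math


def bloom_bits_per_key(fp_chance: float) -> float:
    """Bits per key for an optimally sized Bloom filter (textbook formula)."""
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def bloom_size_mb(num_keys: int, fp_chance: float) -> float:
    """Approximate filter size in MB (1 MB = 1024 * 1024 bytes)."""
    bits = num_keys * bloom_bits_per_key(fp_chance)
    return bits / 8 / (1024 * 1024)


if __name__ == "__main__":
    # Hypothetical workload: 100 million row keys on one node.
    for p in (0.0001, 0.001, 0.01, 0.1):
        print(f"fp_chance={p}: {bloom_bits_per_key(p):.1f} bits/key, "
              f"{bloom_size_mb(100_000_000, p):.0f} MB for 100M keys")
```

The filter grows linearly with the number of keys, and each tenfold increase in the accepted false-positive rate saves roughly 4.8 bits per key.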

"What are your bloom filter settings on your CFs?"

They are default (0, which seems to mean fully enabled: http://www.datastax.com/docs/1.1/configuration/storage_configuration#bloom-filter-fp-chance)

Can't they grow indefinitely, or is there a threshold?

Is there a way to "explore" the heap to be sure that bloom filters are causing this intensive memory use inside the heap before tuning them?
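
One possible way to check, sketched below, is to add up the bloom filter sizes that `nodetool cfstats` reports per column family. This assumes the output contains lines of the form "Bloom Filter Space Used: <bytes>"; the label should be verified against the Cassandra version in use, and the SAMPLE text is made up for illustration.

```python
import re

# Assumed label; check it against the real `nodetool cfstats` output.
BLOOM_LINE = re.compile(r"Bloom Filter Space Used:\s*(\d+)")


def total_bloom_filter_bytes(cfstats_text: str) -> int:
    """Sum every per-column-family bloom filter size found in the text."""
    return sum(int(m.group(1)) for m in BLOOM_LINE.finditer(cfstats_text))


# Fabricated sample, only to show the expected shape of the input.
SAMPLE = """\
Column Family: users
    Bloom Filter Space Used: 1200000
Column Family: events
    Bloom Filter Space Used: 345678
"""

if __name__ == "__main__":
    total = total_bloom_filter_bytes(SAMPLE)
    print(f"bloom filters: {total / (1024 * 1024):.2f} MB")
```

Saving the output of `nodetool -h <host> cfstats` to a file and passing its contents to `total_bloom_filter_bytes` gives a figure to compare against the heap usage the node reports.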

From http://www.datastax.com/docs/1.1/operations/tuning#tuning-bloomfilters :

"For example, to run an analytics application that heavily scans a particular column family, you would want to inhibit or disable the Bloom filter on the column family by setting it high"

Why would I do that? Won't it slow the display of analytics?

Alain


2012/11/7 Bryan <bryan@appssavvy.com>

> What are your bloom filter settings on your CFs? Maybe look here: http://www.datastax.com/docs/1.1/operations/tuning#tuning-bloomfilters
>
> On Nov 7, 2012, at 4:56 AM, Alain RODRIGUEZ wrote:
>
> Hi,
>
> We just had some issues in production that we finally solved by upgrading hardware and increasing the heap.
>
> Now we have 3 xLarge servers from AWS (15G RAM, 4 cpu - 8 cores). We added them and then removed the old ones.
>
> With the full default configuration, the 0.75 threshold of a 4G heap was being reached continuously, so I was obliged to increase the heap to 8G:
>
> Memtable  : 2G   (manually configured)
> Key cache : 0.1G (min(5% of heap (in MB), 100MB))
> System    : 1G   (more or less, from the DataStax doc)
>
> It should use about 3G, and it actually uses between 4 and 6G.
>
> So here are my questions:
>
> How can we know how the heap is being used, and monitor it?
> Why is that much memory used in the heap of my new servers?
>
> All configurations not specified are defaults from Cassandra 1.1.2.
>
> Here is what happened to us before and why we changed our hardware. If you have any clue about what happened, we would be glad to learn and maybe go back to our old hardware.

> -------------------------------- User experience ------------------------------------------------------------------------
>
> We had a 2-node Cassandra 1.1.2 cluster with RF 2 and CL.ONE (R&W) running on 2 m1.Large AWS instances (7.5G RAM, 2 cpu - 4 cores, dedicated to Cassandra only).
>
> cassandra.yaml was configured with the 1.1.2 default options, and in cassandra-env.sh I configured a 4G heap with a 200M "new size".
>
> This is how the heap was supposed to be used:
>
> Memtable  : 1.4G (1/3 of the heap)
> Key cache : 0.1G (min(5% of heap (in MB), 100MB))
> System    : 1G   (more or less, from the DataStax doc)
>
> So we are around 2.5G max in theory, out of 3G usable (threshold of 0.75 of the heap before memtables are flushed because of pressure).
>
> I thought this was OK according to the DataStax documentation:
>
> "Regardless of how much RAM your hardware has, you should keep the JVM heap size constrained by the following formula and allow the operating system's file cache to do the rest:
>
> (memtable_total_space_in_mb) + 1GB + (cache_size_estimate)"
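
The arithmetic behind both heap budgets in this thread can be written out as a small sketch. The function names are hypothetical; the inputs (memtable share, 1GB for the system, key cache as min(5% of heap, 100MB), flush threshold 0.75) are the figures quoted in the message, and nothing here accounts for bloom filters or other structures that also live on the heap.

```python
def key_cache_mb(heap_mb: float) -> float:
    """Key cache estimate from the message: min(5% of heap, 100 MB)."""
    return min(0.05 * heap_mb, 100.0)


def heap_budget(heap_mb: float, memtable_mb: float,
                system_mb: float = 1024.0, threshold: float = 0.75):
    """Return (expected usage in MB, flush-threshold limit in MB)."""
    expected = memtable_mb + system_mb + key_cache_mb(heap_mb)
    limit = threshold * heap_mb
    return expected, limit


if __name__ == "__main__":
    # Old nodes: 4G heap, memtables at 1/3 of the heap.
    old = heap_budget(4096.0, 4096.0 / 3)
    # New nodes: 8G heap, memtables manually set to 2G.
    new = heap_budget(8192.0, 2048.0)
    for name, (expected, limit) in (("old", old), ("new", new)):
        print(f"{name}: expected {expected / 1024:.2f}G, "
              f"threshold at {limit / 1024:.2f}G")
```

In both cases the expected usage sits below the threshold, so whatever pushes the heap past it is not covered by this formula.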

> After adding a third node and changing the RF from 2 to 3 (to allow using CL.QUORUM and still be able to restart a node whenever we want), things went really bad, even though I still don't get how any of these operations could possibly affect the heap needed.
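
The reasoning behind moving to RF 3 can be checked with the usual quorum rule, quorum = floor(RF / 2) + 1. A minimal sketch follows; the helper names are made up for illustration.

```python
def quorum(rf: int) -> int:
    """Replicas that must respond at CL.QUORUM for a replication factor."""
    if rf < 1:
        raise ValueError("replication factor must be at least 1")
    return rf // 2 + 1


def tolerated_down(rf: int) -> int:
    """Replicas that can be down while QUORUM requests still succeed."""
    return rf - quorum(rf)


if __name__ == "__main__":
    for rf in (2, 3):
        print(f"RF={rf}: quorum={quorum(rf)}, "
              f"can lose {tolerated_down(rf)} replica(s)")
```

With RF 2 a quorum needs both replicas, so restarting any node blocks QUORUM requests for the data it holds; with RF 3 one replica can be down.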

> All 3 nodes reached the 0.75 heap threshold (I tried to increase it to 0.85, but it was still reached), and they never came back down. So my cluster started flushing a lot and the load increased because of unceasing compactions. This unexpected load produced latency that brought down our service for a while. Even with the service down, Cassandra was unable to recover.


