Subject: Re: Cassandra Heap Size for data more than 1 TB
From: srmore
To: user@cassandra.apache.org
Date: Thu, 3 Oct 2013 08:02:35 -0500
In-Reply-To: <524D60F4.4030409@opera.com>

Thanks Mohit and Michael,
That's what I thought. I have tried all the avenues and will give ParNew a try. With 1.0.xx I have issues when data sizes go up; hopefully that will not be the case with 1.2.

Just curious, has anyone tried 1.2 with a large data set, around 1 TB?


Thanks !

On Thu, Oct 3, 2013 at 7:20 AM, Michał Michalski <michalm@opera.com> wrote:
I was experimenting with 128 vs. 512 some time ago and I was unable to see any difference in terms of performance. I'd probably check 1024 too, but we migrated to 1.2 and heap space was not an issue anymore.

M.

On 02.10.2013 16:32, srmore wrote:

I changed my index_interval from 128 to 512; does it make sense to increase it beyond this?
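
To make the effect of this change concrete, here is a rough sketch of how the number of in-memory index samples shrinks as index_interval grows. It assumes that one in every index_interval row keys is sampled into memory; the key count is a made-up figure, not a number from this thread, and the per-entry memory cost is deliberately left out because it depends on key size and version.

```python
def sampled_index_entries(num_keys: int, index_interval: int) -> int:
    """Count of keys held in the in-memory sample when every
    index_interval-th key is sampled (ceiling division)."""
    if num_keys < 0 or index_interval <= 0:
        raise ValueError("num_keys must be >= 0 and index_interval > 0")
    return -(-num_keys // index_interval)


# Hypothetical number of row keys on one node (an assumption for illustration).
keys = 1_000_000_000

for interval in (128, 512, 1024):
    entries = sampled_index_entries(keys, interval)
    print(f"index_interval={interval}: {entries} sampled entries")
```

Going from 128 to 512 cuts the sample count to a quarter, so any further increase gives a smaller absolute saving, at the cost of scanning more index entries per lookup.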


On Wed, Oct 2, 2013 at 9:30 AM, cem <cayiroglu@gmail.com> wrote:

Have a look at index_interval.

Cem.


On Wed, Oct 2, 2013 at 2:25 PM, srmore <comomore@gmail.com> wrote:

The version of Cassandra I am using is 1.0.11, we are migrating to 1.2.X though. We had tuned bloom filters (0.1) and AFAIK making it lower than
this won't matter.

Thanks !


On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
Which Cassandra version are you on? Essentially, heap size is a function of the number of keys/metadata. In Cassandra 1.2, a lot of the metadata, such as bloom filters, was moved off heap.
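
As an illustration of how this metadata scales with the number of keys, the sketch below applies the textbook sizing formula for an optimally sized Bloom filter (bits per key = -ln(p) / (ln 2)^2). It is only an estimate: Cassandra's actual filter implementation may size things differently, and the key count is a hypothetical value, not a figure from this thread.

```python
import math


def bloom_filter_bytes(num_keys: int, fp_chance: float) -> float:
    """Approximate size in bytes of an optimally sized Bloom filter
    for num_keys keys at the given false-positive chance."""
    if num_keys < 0 or not 0.0 < fp_chance < 1.0:
        raise ValueError("need num_keys >= 0 and 0 < fp_chance < 1")
    bits_per_key = -math.log(fp_chance) / (math.log(2) ** 2)
    return num_keys * bits_per_key / 8


# Hypothetical number of row keys on one node (an assumption for illustration).
keys = 1_000_000_000

for p in (0.1, 0.01):
    mib = bloom_filter_bytes(keys, p) / 2**20
    print(f"fp_chance={p}: roughly {mib:.0f} MiB of filter data")
```

The estimate grows linearly with the key count, which is consistent with the point that heap pressure follows the number of keys rather than the raw data size when the filters live on heap.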


On Tue, Oct 1, 2013 at 9:34 PM, srmore <comomore@gmail.com> wrote:

Does anyone know roughly what the heap size should be for Cassandra with 1 TB of data? We started with about 200 GB and one of the nodes is already at 1 TB. We were using 8 GB of heap, and that served us well until we reached 700 GB, where we started seeing failures and nodes flipping.

With 1 TB of data the node refuses to come back due to lack of memory. Needless to say, repairs and compactions take a lot of time. We upped the heap from 8 GB to 12 GB and suddenly everything started moving rapidly, i.e. the repair tasks and the compaction tasks. But soon (in about 9-10 hrs) we started seeing the same symptoms as we were seeing with 8 GB.

So my question is: how do I determine the optimal heap size for around 1 TB of data?

The following are some of my JVM settings:

-Xms8G
-Xmx8G
-Xmn800m
-XX:NewSize=1200M
-XX:MaxTenuringThreshold=2
-XX:SurvivorRatio=4
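
One thing worth checking in a flag list like this is that -Xmn and -XX:NewSize both request a young generation size, and here they carry different values. The sketch below is a small, hypothetical sanity check that only detects such a mismatch; it does not claim to know which value the JVM ends up honoring. The helper names are made up for illustration, and the leading dash on -XX:MaxTenuringThreshold is assumed.

```python
import re

UNITS = {"k": 2**10, "m": 2**20, "g": 2**30}


def parse_size(text: str) -> int:
    """Convert a JVM-style size such as '800m' or '8G' to bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(number) * UNITS.get(unit.lower(), 1)


def young_gen_requests(flags):
    """Map each flag that requests a young generation size to its byte value."""
    found = {}
    for flag in flags:
        if flag.startswith("-Xmn"):
            found[flag] = parse_size(flag[len("-Xmn"):])
        elif flag.startswith("-XX:NewSize="):
            found[flag] = parse_size(flag.split("=", 1)[1])
    return found


# Flags as listed in the message (leading dash on MaxTenuringThreshold assumed).
flags = ["-Xms8G", "-Xmx8G", "-Xmn800m", "-XX:NewSize=1200M",
         "-XX:MaxTenuringThreshold=2", "-XX:SurvivorRatio=4"]

requests = young_gen_requests(flags)
if len(set(requests.values())) > 1:
    print("young generation size requested more than once:", requests)
```

If the check fires, keeping only one of the two flags removes the ambiguity about which young generation size is in effect.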

Thanks !







