From: Adi
To: user@cassandra.apache.org
Date: Wed, 7 Sep 2011 13:51:04 -0400
Subject: Re: Calculate number of nodes required based on data

On Wed, Sep 7, 2011 at 1:09 PM, Hefeng Yuan wrote:

> Adi,
>
> The reason we're attempting to add more nodes is to solve the
> long/simultaneous compactions, i.e. the performance issue, not the storage
> issue yet.
> We have RF 5 and CL QUORUM for reads and writes. We currently have 6 nodes,
> and when 4 nodes are compacting at the same time, we're screwed,
> especially on reads, since a read will hit one of the compacting nodes
> anyway.
> My assumption is that if we add more nodes, each node will have less load
> and therefore need less compaction, and will probably compact faster, so
> that we avoid having 4+ nodes compacting simultaneously.
>
> Any suggestion on how to calculate how many more nodes to add? Or, more
> generally, how to plan for the number of nodes required from a performance
> perspective?
>
> Thanks,
> Hefeng

Adding nodes to delay and reduce compaction is an interesting performance
use case :-) I think you can find a smarter/cheaper way to manage that.
Have you looked at the following?

a) Increasing memtable throughput

What is the nature of your writes? Are they mostly inserts, or do they also
include a lot of quick updates to recently inserted data? Increasing
memtable_throughput can delay, and maybe reduce, the compaction cost if you
have lots of updates to the same data. You will have to provide the extra
memory if you try this.
When you mentioned "with ~9m serialized bytes", is that the memtable
throughput? That is quite a low threshold, and it will result in a large
number of SSTables needing to be compacted. I think the default is 256 MB,
and the lowest values I have seen are 64 MB or maybe 32 MB.

b) Tweaking min_compaction_threshold and max_compaction_threshold

- Increasing min_compaction_threshold will delay compactions.
- Decreasing max_compaction_threshold will reduce the number of SSTables
  per compaction cycle.

Are you using the defaults (4 and 32), or are you trying different values?

c) Splitting column families

Splitting column families can also help, because compactions occur
serially, one CF at a time, and that spreads your compaction cost out over
time and across column families. It requires a change in app logic, though.

-Adi
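
To make the quorum arithmetic in the quoted question concrete, here is a rough sketch in Python. It is not taken from Cassandra itself: the function names are made up for illustration, and it assumes the coordinator could always prefer non-compacting replicas, which real replica selection does not guarantee.

```python
def quorum_size(replication_factor):
    """Replicas that must answer at CL QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def busy_replicas_needed(total_nodes, replication_factor, busy_nodes):
    """Return (best_case, worst_case) count of compacting replicas that a
    QUORUM request has to wait on, depending on where the key's replicas sit.

    Assumption (hypothetical): idle replicas are always chosen first.
    """
    q = quorum_size(replication_factor)
    idle_nodes = total_nodes - busy_nodes
    # A key's replica set holds at most this many idle nodes...
    most_idle = min(replication_factor, idle_nodes)
    # ...and at least this many, if every busy node is one of its replicas.
    fewest_idle = max(0, replication_factor - busy_nodes)
    best = max(0, q - most_idle)
    worst = max(0, q - fewest_idle)
    return best, worst


if __name__ == "__main__":
    # The cluster from the thread: 6 nodes, RF 5, 4 nodes compacting.
    print(quorum_size(5))                 # 3
    print(busy_replicas_needed(6, 5, 4))  # (1, 2)
```

With 6 nodes and RF 5 there are only 2 idle nodes while 4 compact, so a quorum of 3 always includes at least one compacting replica, which matches what Hefeng describes.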
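
As a back-of-the-envelope illustration of points a) and b), the sketch below counts flushes and compactions in a deliberately simplified model. The assumptions are mine, not from the thread: every flush produces one SSTable of exactly the memtable size, there are no overwrites, each compaction merges exactly min_compaction_threshold equal-sized SSTables into one, and "~9m" is read as roughly 9 MB. The function names are hypothetical.

```python
def flush_count(write_mb, memtable_mb):
    """Number of full memtable flushes (one SSTable each) for write_mb of writes."""
    return write_mb // memtable_mb


def compactions_triggered(flushes, min_threshold=4):
    """Count compactions in a simplified size-tiered model.

    Each group of min_threshold same-sized SSTables merges into one SSTable
    of the next tier; leftovers stay where they are.
    """
    total = 0
    tables = flushes
    while tables >= min_threshold:
        tables //= min_threshold  # merged tables that land in the next tier
        total += tables           # one compaction per merged table
    return total


if __name__ == "__main__":
    written_mb = 1024  # hypothetical write volume per column family
    for memtable_mb, threshold in [(9, 4), (64, 4), (9, 8)]:
        flushes = flush_count(written_mb, memtable_mb)
        print(memtable_mb, threshold, flushes,
              compactions_triggered(flushes, threshold))
```

Under these assumptions, 1 GB of writes with a 9 MB memtable produces 113 SSTables and 36 compactions at the default minimum of 4, against 16 SSTables and 5 compactions with a 64 MB memtable. Raising the minimum threshold to 8 with the 9 MB memtable gives 15 compactions. Real numbers will differ, especially with many updates to the same rows, but the direction is the same as described above.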