Date: Sat, 5 Feb 2011 11:58:48 +0530
Subject: Re: Merging the rows of two column families(with similar attributes) into one ??
From: Ertio Lew
To: user@cassandra.apache.org

Thanks Tyler!

I could not fully understand why a larger number of column families would mean more memory. Parameters like memtable_throughput and memtable_operations are set per column family, so you can control memory use directly: take the values you would have used for a single CF and split them between the two CFs in proportion. In that case, shouldn't there be no extra memory consumption for multiple CFs that have been split from a single one?

Regarding compactions: even if there are more of them, I think the SSTable files being compacted are smaller, because the data has been split in two. So more compactions, but smaller ones too!

Then, for the same amount of data, how can a greater number of column families be a bad option (if you split the values of the memory-related parameters proportionately)?
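The proportional split described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Cassandra configuration code: the function name, the dictionary keys (taken from the parameter names used in this message) and the example numbers are all assumptions.

```python
def split_memtable_settings(throughput, operations, share_a):
    """Split one CF's memtable thresholds between two CFs.

    `share_a` is the fraction given to the first CF; the second CF
    gets the remainder, so the two always add up to the original.
    All names and units here are hypothetical, for illustration only.
    """
    if not 0.0 < share_a < 1.0:
        raise ValueError("share_a must be strictly between 0 and 1")

    cf_a = {
        "memtable_throughput": throughput * share_a,
        "memtable_operations": operations * share_a,
    }
    # Give the second CF whatever is left of the original budget.
    cf_b = {
        "memtable_throughput": throughput - cf_a["memtable_throughput"],
        "memtable_operations": operations - cf_a["memtable_operations"],
    }
    return cf_a, cf_b


# Example: a single-CF budget of 64 and 0.5, split 75% / 25%.
cf_a, cf_b = split_memtable_settings(64, 0.5, 0.75)
print(cf_a)
print(cf_b)
```

The sketch only models the two thresholds named above, so it shows that those two budgets add up to the single-CF values; it says nothing about any other per-CF cost.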
--
Regards,
Ertio

On Sat, Feb 5, 2011 at 10:43 AM, Tyler Hobbs wrote:
>
>> I read somewhere that more no of column families is not a good idea as
>> it consumes more memory and more compactions to occur
>
> This is primarily true, but not in every case.
>
>> But the caching requirements may be different as they cater to two
>> different features.
>
> This is a great reason to *not* merge them. Besides the key and row caches,
> don't forget about the OS buffer cache.
>
>> Is it recommended to merge these two column families into one ?? Thoughts
>> ?
>
> No, this sounds like an anti-pattern to me. The overhead from having two
> separate CFs is not that high.
>
> --
> Tyler Hobbs
> Software Engineer, DataStax
> Maintainer of the pycassa Cassandra Python client library