Message-ID: <4DEE709E.8000809@dude.podzone.net>
Date: Tue, 07 Jun 2011 12:40:30 -0600
From: AJ
To: user@cassandra.apache.org
Subject: Re: Backups, Snapshots, SSTable Data Files, Compaction
References: <4DEDB132.4070100@dude.podzone.net> <4DEDB65C.203@datastax.com> <4DEDD135.6090800@dude.podzone.net> <4DEE4068.9090802@dude.podzone.net> <4DEE4ECD.1070407@datastax.com>
In-Reply-To: <4DEE4ECD.1070407@datastax.com>

Thanks to everyone who has responded so far.

On 6/7/2011 10:16 AM, Benjamin Coverston wrote:
> Not to say that there aren't workloads where having many TB/Node
> doesn't work, but if you're planning to read from the data you're
> writing you do want to ensure that your working set is stored in memory.
>

Thank you, Ben. Can you elaborate on the above point? Are you referring to the OS's working set or to the Cassandra caches? Why exactly do I need to ensure this?

I am also wondering whether there is any reason to segregate my smallish, frequently written and read data set (such as usage statistics) from my bulk, mostly read-only data set (static content) into separate CFs, if the schema allows it. Would this be of any benefit?