Subject: Re: Key Caching
From: Peter Schuller
To: user@cassandra.apache.org
Reply-To: user@cassandra.apache.org
Date: Mon, 26 Jul 2010 23:04:29 +0200

> If the cache is stored in the heap, how big can the heap be made
> realistically on a 24gb ram machine?
> I am a java newbie but I have read concerns with going over 8gb for the
> heap as the GC can be too painful/take too long. I already have seen
> timeout issues (node is dead errors) under load during GC or compaction.
> Can/should the heap be set to 16gb with 24gb ram?

I have never run Cassandra in production with such a large heap, so I'll
let others comment on practical experience with that. In general, however,
with the JVM and the CMS garbage collector (which is enabled by default
with Cassandra), having a large heap is not necessarily a problem,
depending on the application's workload.

In terms of GCs taking too long: with the default throughput collector
used by the JVM, you will tend to see the longest pause times scale
roughly linearly with heap size. Most pauses would still be short (these
are what are known as young generation collections), but periodically a
so-called full collection is done. With the throughput collector, this
implies stopping all Java threads while the *entire* Java heap is garbage
collected.

With the CMS (Concurrent Mark/Sweep) collector, the intent is that the
periodic scans of the entire Java heap are done concurrently with the
application, without pausing it. Fallback to full stop-the-world garbage
collections can still happen if CMS fails to complete such work fast
enough, in which case tweaking of garbage collection settings may be
required.

One thing to consider in any case is how much memory you actually need;
the more you give to the JVM, the less there is left for the OS to cache
file contents. If, for example, your true working set in Cassandra is, to
grab a random number, 3 GB and you set the heap size to 15 GB, you are
wasting a lot of memory by allowing the JVM to postpone GC until it starts
approaching the 15 GB mark.
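For concreteness, here is a sketch of the kind of JVM options involved.
The exact file and defaults vary by Cassandra version (bin/cassandra.in.sh
or conf/cassandra-env.sh), and the values below are illustrative
assumptions for discussion, not recommendations:

```shell
# Illustrative JVM options for a Cassandra node -- adjust to your setup.
# Pinning -Xms equal to -Xmx avoids pauses from heap resizing.
JVM_OPTS="$JVM_OPTS -Xms8G -Xmx8G"

# CMS is the collector Cassandra enables by default; spelling it out here.
# The occupancy flags ask CMS to start its concurrent cycle early enough
# to reduce the risk of falling back to a stop-the-world full collection.
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"

# Log GC activity so you can check whether pauses line up with the
# "node is dead" timeouts you are seeing under load.
JVM_OPTS="$JVM_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
```

With logging on, you can compare pause durations in the GC log against
your RPC timeout to see whether collections are the actual cause of the
dead-node reports.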
This is actually good (normally) for overall GC throughput, but not
necessarily good overall for something like Cassandra, where there is a
direct trade-off with cache eviction in the operating system possibly
causing additional I/O.

Personally I'd be very interested in hearing any stories about running
Cassandra nodes with 10+ gig heap sizes, and how well it has worked. My
gut feeling is that it should work reasonably well, but I have no evidence
of that and I may very well be wrong. Anyone?

(On a related note, my limited testing with the G1 collector with
Cassandra has indicated it works pretty well. Though I'm concerned with
the weak-reference-finalization-based cleanup of compacted sstables, since
the G1 collector will be much less deterministic about when a particular
object may be collected. Has anyone deployed Cassandra with G1 on very
large heaps under real load?)

-- 
/ Peter Schuller