Date: Tue, 4 May 2010 13:47:03 -0700
Subject: BloomFilter is taking too much memory
From: Weijun Li
To: user@cassandra.apache.org

Hello,

We stored about 47 million keys in one Cassandra node, and a memory dump shows the following for one of the SSTableReader instances:

    SSTableReader: 386MB. Of this 386MB, IndexSummary takes about 231MB and BloomFilter takes 155MB, with an embedded huge array long[19.4mil].

It seems that BloomFilter is taking too much memory. If this is the case, BloomFilter seems redundant compared to the size of the index. So is this the desired behavior? Is there a formula to estimate the memory needed for BloomFilter?

Thanks,

-Weijun

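On the formula question, a rough sketch may help frame the numbers. It uses the textbook Bloom filter sizing formulas, m = -n * ln(p) / (ln 2)^2 bits and k = (m / n) * ln 2 hash functions; whether Cassandra sizes its filter this way is not stated in this message, so treat that as an assumption. The class name, method names, and the 1% false-positive target are hypothetical and chosen only for illustration.

```java
// Back-of-the-envelope Bloom filter sizing using textbook formulas.
// This is NOT Cassandra's code; all names here are hypothetical.
class BloomFilterSizing {

    /** Payload bytes of a long[] of the given length (8 bytes per element, object header ignored). */
    static long longArrayBytes(long length) {
        return length * Long.BYTES;
    }

    /** Textbook optimal bit count for n keys at false-positive rate p: m = -n * ln(p) / (ln 2)^2. */
    static long optimalBits(long n, double p) {
        double ln2 = Math.log(2.0);
        return (long) Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    }

    /** Textbook optimal hash function count for m bits and n keys: k = (m / n) * ln 2. */
    static int optimalHashCount(long m, long n) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2.0)));
    }

    public static void main(String[] args) {
        // Array length reported in the memory dump.
        long arrayBytes = longArrayBytes(19_400_000L);
        System.out.printf("long[19.4M] payload: %.1f MB%n", arrayBytes / 1e6);

        long n = 47_000_000L; // key count from the message (per node, not per SSTable)
        double p = 0.01;      // assumed target false-positive rate, for illustration only
        long bits = optimalBits(n, p);
        System.out.printf("n=%d, p=%.2f: about %.1f MB, k=%d%n",
                n, p, bits / 8 / 1e6, optimalHashCount(bits, n));
    }
}
```

The first calculation only confirms that a long[] of 19.4 million elements occupies about 155 MB (19.4 million x 8 bytes), which agrees with the figure in the dump. The second shows what the textbook formula would suggest for a chosen key count and error rate; the inputs should be replaced with the per-SSTable key count and whatever false-positive rate is acceptable.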