From: "Hiller, Dean"
To: "user@cassandra.apache.org"
Date: Mon, 25 Feb 2013 06:27:05 -0700
Subject: Re: disabling bloomfilter not working? or did I do this wrong?

Hmmmm, I thought bloom filters only help on missing rows. Any time we look up a row, we know it is there in our case, as it would not be in the other table otherwise. I would say statistically 99.9% of the time the row is there, and we are okay with wasting a disk hit the other 0.1% of the time.

Do I have this correct though? Bloom filters really only help me if the data is not there, so I don't have to go to the disk to find that out.

Thanks,
Dean

From: aaron morton
Reply-To: "user@cassandra.apache.org"
Date: Sunday, February 24, 2013 7:09 PM
To: "user@cassandra.apache.org"
Subject: Re: disabling bloomfilter not working? or did I do this wrong?

Yeah, disabling completely is probably not great. There is some wriggle room between disabled and "less memory".

Did I link to this bloom filter calculator? http://hur.st/bloomfilter
also https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/BloomCalculations.java

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 23/02/2013, at 12:10 PM, Bryan Talbot wrote:

I see from your read and write counts that your nreldata CF has a nearly equal number of reads and writes. I would expect that disabling your bloom filter is going to hurt your read performance quite a bit.

Also, beware that disabling your bloom filter may also cause tombstoned rows to never be deleted, so if you delete all columns explicitly or use TTL, your data may grow more than you expect.
https://issues.apache.org/jira/browse/CASSANDRA-5182

-Bryan

On Fri, Feb 22, 2013 at 11:59 AM, Hiller, Dean wrote:

Thanks, but I found out it is still running. It looks like I have about a 5 hour wait left for my upgradesstables (waited 4 hours already). I will check the bloom filter after that.

Out of curiosity, if I had much wider rows (ie. < 900k) per row, will compaction run faster (errrr…upgradesstables) at all, or would it basically run at the same speed?

I guess what I am wondering is: is 9 hours a normal compaction time for 130gb of data?

Thanks,
Dean

From: aaron morton
Reply-To: "user@cassandra.apache.org"
Date: Friday, February 22, 2013 10:29 AM
To: "user@cassandra.apache.org"
Subject: Re: disabling bloomfilter not working? or did I do this wrong?

Bloom Filter Space Used: 2318392048
Just to be sane, do a quick check of the -Filter.db files on disk for this CF. If they are very small, try a restart on the node.

Number of Keys (estimate): 1249133696
Hey, a billion rows on a node, what an age we live in :)

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 23/02/2013, at 4:35 AM, "Hiller, Dean" wrote:

So in the cli, I ran

update column family nreldata with bloom_filter_fp_chance=1.0;

Then I ran

nodetool upgradesstables databus5 nreldata;

But my bloom filter size is still around 2gig (and I want to free up this heap)!!!! According to nodetool cfstats command…

Column Family: nreldata
SSTable count: 10
Space used (live): 96841497731
Space used (total): 96841497731
Number of Keys (estimate): 1249133696
Memtable Columns Count: 7066
Memtable Data Size: 4286174
Memtable Switch Count: 924
Read Count: 19087150
Read Latency: 0.595 ms.
Write Count: 21281994
Write Latency: 0.013 ms.
Pending Tasks: 0
Bloom Filter False Postives: 974393
Bloom Filter False Ratio: 0.99998
Bloom Filter Space Used: 2318392048
Compacted row minimum size: 73
Compacted row maximum size: 446
Compacted row mean size: 143
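
For anyone wanting to put numbers on the "wriggle room" between a disabled filter and a smaller one, here is a rough Python sketch. It uses the textbook optimal Bloom filter formula (bits per key = -ln(p) / (ln 2)^2), which is an assumption on my part and not Cassandra's actual sizing code; BloomCalculations.java may well produce different sizes. The key count and filter size are copied from the cfstats output above, and the function names are made up for illustration.

```python
import math


def bits_per_key(fp_chance: float) -> float:
    """Textbook optimal Bloom filter bits per key for a target false-positive chance.

    Assumption: the standard formula, not Cassandra's own bucket table.
    """
    if not 0.0 < fp_chance <= 1.0:
        raise ValueError("fp_chance must be in (0, 1]")
    if fp_chance == 1.0:
        # Every lookup is allowed to be a false positive, so no bits are needed.
        return 0.0
    return -math.log(fp_chance) / (math.log(2) ** 2)


def filter_bytes(num_keys: int, fp_chance: float) -> int:
    """Estimated filter size in bytes for num_keys keys."""
    return math.ceil(num_keys * bits_per_key(fp_chance) / 8)


# Figures copied from the cfstats output in this thread.
NUM_KEYS = 1249133696     # Number of Keys (estimate)
SPACE_USED = 2318392048   # Bloom Filter Space Used, in bytes

# Bits per key implied by the filter currently held on the heap.
observed_bits_per_key = SPACE_USED * 8 / NUM_KEYS

if __name__ == "__main__":
    print(f"observed: {observed_bits_per_key:.2f} bits/key")
    for p in (0.01, 0.1, 0.5, 1.0):
        est = filter_bytes(NUM_KEYS, p)
        print(f"fp_chance={p}: ~{est / 2**30:.2f} GiB")
```

Under this formula, raising the false-positive chance shrinks the estimate steadily, and fp_chance=1.0 needs no bits at all. A filter that is still around 2gig therefore suggests the old -Filter.db files have not been rewritten yet, which would fit upgradesstables still running.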