From: Ananth Gundabattula
To: "user@cassandra.apache.org"
Date: Tue, 18 Jun 2013 03:22:22 -0500
Subject: Re: What is the effect of reducing the thrift message sizes on GC

Thanks Aaron for the insight. One quick question:

>The buffers are not pre-allocated, but once they are allocated they are
>not returned. So it's only an issue if you have lots of clients connecting
>and reading a lot of data.

So to understand you correctly, the buffer is allocated per client
connection, stays allocated for the lifetime of the JVM, and is reused for
each request?

If that is the case, then I presume there is not much to gain by playing
around with this config as far as optimizing for GC goes.

>reduce bloom filters, index intervals ...

Well, we have tried all the configs as advised below (and others, such as
key cache sizes), hit a dead end, and that is the reason for the move to
1.2.4.

Thanks for all your thoughts and advice on this.

Regards,
Ananth

On 6/18/13 5:56 PM, "aaron morton" wrote:

>> *thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb*
>These control the max size of a buffer allocated by thrift when processing
>requests / responses.
>The buffers are not pre-allocated, but once they are allocated they are
>not returned. So it's only an issue if you have lots of clients connecting
>and reading a lot of data.
>
>> Our system has very short tables (both in number of columns and data
>> sizes) but millions/billions of rows in each column family.
>If you have over 500 million rows per node you may be running into issues
>with the bloom filters and index samples.
>
>This typically looks like the heap usage not reducing after a CMS
>collection has completed.
>
>Ensure the bloom_filter_fp_chance on the CFs is set to 0.01 for size
>tiered compaction and 0.1 for levelled compaction. If you need to change
>it, run nodetool upgradesstables.
>
>Then consider increasing the index_interval in the yaml file; see the
>comments.
>
>Note that v 1.2 moves the bloom filters off heap, so if you upgrade to
>1.2 it will probably resolve your issues.
>
>Cheers
>
>-----------------
>Aaron Morton
>Freelance Cassandra Consultant
>New Zealand
>
>@aaronmorton
>http://www.thelastpickle.com
>
>On 18/06/2013, at 7:30 PM, Ananth Gundabattula wrote:
>
>> We are currently running on 1.1.10 and planning to migrate to a higher
>> version, 1.2.4.
>>
>> The question pertains to tweaking all the knobs to reduce GC related
>> issues (we have been fighting a lot of really bad GC issues on 1.1.10
>> and have had little success throughout).
>>
>> Taking into consideration that GC tuning is a black art, I was wondering
>> if we can have some good effect on the GC by tweaking the following
>> settings:
>>
>> *thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb*
>>
>> Our system has very short tables (both in number of columns and data
>> sizes) but millions/billions of rows in each column family. The typical
>> number of columns in each column family is 4. The typical lookup
>> involves specifying the row key and fetching one column most of the
>> time.
>> The writes are also similar, except for one keyspace where the number
>> of columns is 50 but with very small data sizes per column.
>>
>> Assuming we can tweak the config values:
>>
>> *thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb*
>>
>> to lower values in the above context, I was wondering whether the GC
>> would be invoked less often if the thrift settings reflected our data
>> model's reads and writes?
>>
>> For example: what is the impact on the GC of reducing the above config
>> values to, say, 1 MB rather than 15 or 16?
>>
>> Thanks a lot for your inputs and thoughts.
>>
>>
>> Regards,
>> Ananth
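
For a rough sense of the two memory costs discussed in this thread, here is a back-of-the-envelope sketch in Python. It is an estimate under stated assumptions, not Cassandra's actual accounting: the bloom filter size uses the textbook formula for an optimally sized filter (bits per key = -ln(p) / (ln 2)^2), which Cassandra's implementation only approximates, and the thrift figure assumes one retained buffer per client connection that grows to the full configured frame size, which the thread asks about but does not confirm. The function names and the connection count of 200 are hypothetical.

```python
import math


def bloom_filter_bytes(num_keys: int, fp_chance: float) -> float:
    """Textbook size, in bytes, of an optimally sized Bloom filter.

    Assumption: bits per key = -ln(p) / (ln 2)^2. Cassandra's real
    filters are only approximated by this.
    """
    bits_per_key = -math.log(fp_chance) / (math.log(2) ** 2)
    return num_keys * bits_per_key / 8


def thrift_buffer_ceiling_bytes(connections: int, frame_size_mb: int) -> int:
    """Upper bound on retained thrift buffers, in bytes.

    Assumption: one buffer per connection, each grown to the full frame
    size and never returned.
    """
    return connections * frame_size_mb * 1024 * 1024


MIB = 1024 * 1024
ROWS_PER_NODE = 500_000_000  # the threshold mentioned in the thread
CONNECTIONS = 200            # hypothetical client connection count

for fp in (0.01, 0.1):
    size = bloom_filter_bytes(ROWS_PER_NODE, fp) / MIB
    print(f"bloom filter, fp_chance={fp}: about {size:.0f} MiB")

for frame_mb in (15, 1):
    size = thrift_buffer_ceiling_bytes(CONNECTIONS, frame_mb) / MIB
    print(f"thrift buffers, {frame_mb} MB frames: at most {size:.0f} MiB")
```

Under these assumptions the bloom filters for 500 million rows take several hundred MiB of heap on 1.1.x regardless of the thrift settings, while the thrift buffers only approach their ceiling if many clients actually send or read frames near the configured maximum, which the small rows described above would not do.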