incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: What is the effect of reducing the thrift message sizes on GC
Date Tue, 18 Jun 2013 07:56:36 GMT
> *thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb*
These control the max size of a buffer allocated by thrift when processing requests / responses.
The buffers are not pre-allocated, but once they are allocated they are not returned. So it's
only an issue if you have lots of clients connecting and reading a lot of data.
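
As a rough illustration of why the setting only matters with many busy clients, here is a back-of-the-envelope sketch in Python. It assumes the worst case, where every connection ends up holding one buffer grown to the full frame size; the connection count of 200 is a made-up figure, and worst_case_thrift_buffer_mb is a hypothetical helper, not anything in Cassandra.

```python
def worst_case_thrift_buffer_mb(connections: int, frame_size_mb: int) -> int:
    """Upper bound on heap retained by thrift buffers, in MB.

    Assumption: each connection holds at most one buffer, and that
    buffer has grown to the configured maximum frame size.
    """
    return connections * frame_size_mb


if __name__ == "__main__":
    connections = 200  # hypothetical number of client connections
    for frame_mb in (1, 15):
        total = worst_case_thrift_buffer_mb(connections, frame_mb)
        print(f"frame size {frame_mb} MB -> up to {total} MB retained")
```

With small rows and single-column reads the buffers should stay far below this bound, since they are only grown on demand.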

> Our system is a very short column (both in number of columns and data sizes
> ) tables but having millions/billions of rows in each column family.
If you have over 500 million rows per node you may be running into issues with the bloom filters
and index samples. 

This typically shows up as heap usage that does not drop after a CMS collection has completed.


Ensure the bloom_filter_fp_chance on the CFs is set to 0.01 for size tiered compaction and
0.1 for levelled compaction. If you need to change it, run nodetool upgradesstables afterwards
so the existing SSTables are rewritten with the new setting.
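
To get a feel for how much heap the bloom filters can take on 1.1, the textbook sizing formula for an optimal bloom filter can be used as an approximation. This is only a sketch: Cassandra's own implementation rounds bits per key differently, and bloom_filter_bytes is a hypothetical helper.

```python
import math


def bloom_filter_bytes(keys: int, fp_chance: float) -> float:
    """Approximate size of an optimal bloom filter in bytes.

    Uses the standard formula bits_per_key = -ln(p) / (ln 2)^2.
    This is an estimate, not Cassandra's exact allocation.
    """
    bits_per_key = -math.log(fp_chance) / (math.log(2) ** 2)
    return keys * bits_per_key / 8


if __name__ == "__main__":
    rows = 500_000_000  # the per-node row count mentioned above
    for p in (0.01, 0.1):
        mb = bloom_filter_bytes(rows, p) / 1e6
        print(f"fp_chance={p}: about {mb:.0f} MB")
```

For 500 million rows this comes out at roughly 0.6 GB at 0.01 and roughly 0.3 GB at 0.1, all of which sits on the heap in 1.1.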

Then consider increasing index_interval in cassandra.yaml; see the comments there.
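
The effect of index_interval is easy to estimate: one index entry in every index_interval rows is sampled into memory. A minimal sketch, assuming the default of 128 and ignoring that sampling is done per SSTable (so real counts will differ somewhat); index_sample_count is a hypothetical helper.

```python
def index_sample_count(rows: int, index_interval: int) -> int:
    """Approximate number of index samples held in memory.

    Assumption: one sample per index_interval rows, treating all
    rows on the node as if they were in a single SSTable.
    """
    return rows // index_interval


if __name__ == "__main__":
    rows = 500_000_000
    for interval in (128, 256, 512):
        samples = index_sample_count(rows, interval)
        print(f"index_interval={interval}: about {samples:,} samples")
```

Doubling the interval halves the number of samples kept on the heap, at the cost of scanning more index entries per read.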

Note that v 1.2 moves the bloom filters off heap, so if you upgrade to 1.2 it will probably
resolve your issues. 

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 18/06/2013, at 7:30 PM, Ananth Gundabattula <agundabattula@threatmetrix.com> wrote:

> We are currently running on 1.1.10 and planning to migrate to a higher
> version 1.2.4.
> 
> The question pertains to tweaking all the knobs to reduce GC related issues
> ( we have been fighting a lot of really bad GC issues on 1.1.10 and met with little
> success all the way using 1.1.10)
> 
> Taking into consideration GC tuning is a black art, I was wondering if we
> can have some good effect on the GC by tweaking the following settings:
> 
> *thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb*
> 
> Our system is a very short column (both in number of columns and data sizes
> ) tables but having millions/billions of rows in each column family. The typical
> number of columns in each column family is 4. The typical lookup involves
> specifying the row key and fetching one column most of the times. The
> writes are also similar except for one keyspace where the number of columns
> are 50 but very small data sizes per column.
> 
> Assuming we can tweak the config values:
> 
> *thrift_framed_transport_size_in_mb & thrift_max_message_length_in_mb*
> 
> to lower values in the above context, I was wondering if it helps in the GC
> being invoked less if the thrift settings reflect our data model reads and writes ?
> 
> For example: What is the impact by reducing the above config values on the
> GC to say 1 mb rather than say 15 or 16 ?
> 
> Thanks a lot for your inputs and thoughts.
> 
> 
> Regards,
> Ananth

