incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: Does variation in no of columns in rows over the column family has any performance impact ?
Date Tue, 08 Feb 2011 20:07:06 GMT
For completeness there are a couple of things in the config file that may be interesting if
you run into issues.

- column_index_size_in_kb defines how big a row has to get before an index is written for
the row. Without an index the entire row must be read to find a column. 

- in_memory_compaction_limit_in_mb defines the maximum size of a row that can be compacted
in memory; larger rows go through a slower compaction process.

- sliced_buffer_size_in_kb controls the size of the buffer when slicing columns. 
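
To illustrate the first point, here is a rough Python sketch of why a per-row column index helps. It is not Cassandra's implementation: the helper names are made up for illustration, and blocks are sized by column count here, whereas column_index_size_in_kb is a size in KB.

```python
import bisect

def build_index(columns, block_size):
    """Record the first column name of every block of `block_size` columns."""
    return [columns[i][0] for i in range(0, len(columns), block_size)]

def find_without_index(columns, name):
    """Scan the row from the start; return (value, columns_read)."""
    for read, (col, value) in enumerate(columns, start=1):
        if col == name:
            return value, read
    return None, len(columns)

def find_with_index(columns, index, block_size, name):
    """Use the index to jump to one block, then scan only that block."""
    block = bisect.bisect_right(index, name) - 1
    if block < 0:
        return None, 0
    start = block * block_size
    chunk = columns[start:start + block_size]
    for read, (col, value) in enumerate(chunk, start=1):
        if col == name:
            return value, read
    return None, len(chunk)

# A hypothetical wide row: 100,000 columns sorted by name.
row = [(i, "v%d" % i) for i in range(100000)]
index = build_index(row, 1000)

print(find_without_index(row, 99500))          # reads most of the row
print(find_with_index(row, index, 1000, 99500))  # reads part of one block
```

With the index, the lookup touches only the block that can contain the column instead of everything that precedes it in the row.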

 Aaron
 
On 08 Feb, 2011, at 08:03 AM, Daniel Doubleday <daniel.doubleday@gmx.net> wrote:

It depends a little on your write pattern:

- Wide rows tend to get distributed over more sstables, so more disk reads are necessary. This
will become noticeable when you have high I/O load and reads actually hit the disks.
- If you delete a lot, slice query performance might suffer. An extreme example: create 2M columns,
delete the first 1M, and then ask for the first 10. The read has to skip over the deleted columns
before it reaches the first live one.
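
The extreme example can be sketched in a few lines of Python. This is only a toy model under the assumption that deleted columns remain in the row as markers that a slice must read and skip; the function name and the (name, deleted) layout are invented for illustration and are not Cassandra's storage format.

```python
def slice_first(columns, count):
    """Return the first `count` live column names and the number of entries scanned.

    `columns` yields (name, deleted) pairs in sorted order; deleted entries
    stand in for deletion markers that must be read and skipped.
    """
    live = []
    scanned = 0
    for name, deleted in columns:
        scanned += 1
        if not deleted:
            live.append(name)
            if len(live) == count:
                break
    return live, scanned

total, removed = 2_000_000, 1_000_000
# Generate the row lazily: the first 1M columns are marked deleted.
row = ((i, i < removed) for i in range(total))

live, scanned = slice_first(row, 10)
print(live[0], scanned)
```

In this model, asking for 10 columns means scanning about a million entries first, which is why a delete-heavy pattern at the start of a wide row can hurt slice queries.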


On Feb 7, 2011, at 7:07 AM, Aditya Narayan wrote:

> Does huge variation in no. of columns in rows, over the column family
> has *any* impact on the performance ?
> 
> Can I have like just 100 columns in some rows and like hundred
> thousands of columns in another set of rows, without any downsides ?

