For completeness, there are a few settings in the config file that may be interesting if you run into issues.
- column_index_size_in_kb defines how big a row has to get before an index is written for the row. Without an index, the entire row must be read to find a column.
- in_memory_compaction_limit_in_mb defines the maximum size of a row that can be compacted in memory; larger rows go through a slower compaction process.
- sliced_buffer_size_in_kb controls the size of the buffer used when slicing columns.
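For reference, these settings live in cassandra.yaml. A sketch with illustrative values only — the defaults and exact names can vary between versions, so check the file shipped with your release:

```yaml
# cassandra.yaml (illustrative values; verify against your version's defaults)

# Row size threshold before a column index is written for the row.
column_index_size_in_kb: 64

# Rows larger than this go through the slower on-disk compaction path.
in_memory_compaction_limit_in_mb: 64

# Buffer size used when slicing columns out of a row.
sliced_buffer_size_in_kb: 64
```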
On 08 Feb, 2011, at 08:03 AM, Daniel Doubleday <firstname.lastname@example.org> wrote:
It depends a little on your write pattern:
- Wide rows tend to get distributed over more sstables, so more disk reads are necessary. This becomes noticeable when you have high I/O load and reads actually hit the disks.
- If you delete a lot, slice query performance might suffer. Extreme example: create 2M columns, delete the first 1M, and then ask for the first 10.
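The tombstone effect in that extreme example can be sketched with a toy model (this is not Cassandra's actual read path, just an illustration): a slice for the first N live columns still has to scan every deleted column that sorts before them.

```python
# Toy model of a slice read over a row whose leading columns were deleted.
# Columns are (name, is_tombstone) pairs sorted by name; a tombstone marks
# a deleted column that has not yet been purged by compaction.

def slice_first_live(columns, n):
    """Return the first n live column names and the number of cells scanned."""
    live = []
    scanned = 0
    for name, is_tombstone in columns:
        scanned += 1
        if not is_tombstone:
            live.append(name)
            if len(live) == n:
                break
    return live, scanned

# Row with 2M columns; the first 1M were deleted (tombstoned).
row = [(i, i < 1_000_000) for i in range(2_000_000)]
live, scanned = slice_first_live(row, 10)
# Only 10 live columns come back, but over a million cells were scanned.
```

The query returns 10 columns yet touches 1,000,010 cells, which is why heavy deletes at the head of a row make small slices disproportionately expensive until compaction purges the tombstones.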
On Feb 7, 2011, at 7:07 AM, Aditya Narayan wrote:
> Does huge variation in the no. of columns in rows across the column family
> have *any* impact on the performance?
> Can I have just 100 columns in some rows and hundreds of
> thousands of columns in another set of rows, without any downsides?