cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-3122) SSTableSimpleUnsortedWriter take long time when inserting big rows
Date Fri, 02 Sep 2011 16:07:10 GMT


Sylvain Lebresne commented on CASSANDRA-3122:

bq. I don't understand how the changes to writeRow work without doing anything to the cf besides asking
for its serializedSize

In (the new method) getColumnFamily, when we reuse a previous column family to add new columns
to it, we start by removing its size from the estimate, so that when writeRow is called on
the updated cf, adding back the whole size keeps the estimate accurate (actually more accurate
than before, because we no longer count the row key multiple times).
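The bookkeeping described above can be sketched as follows. This is an illustrative model, not Cassandra's actual code: the names (currentSize, buffer, getEntry) are hypothetical stand-ins for getColumnFamily/writeRow, and the "serialized size" is just a long.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the subtract-then-re-add size-estimate technique:
// when an existing entry is reused, its old size is removed from the running
// estimate, so adding the full updated size later does not double count.
class SizeEstimateSketch {
    // running estimate of buffered bytes
    long currentSize = 0;
    // buffered entries keyed by row key; value = last known serialized size
    final Map<String, Long> buffer = new HashMap<>();

    // Reuse an existing entry: un-count its previous size first.
    long getEntry(String key) {
        Long previous = buffer.get(key);
        if (previous != null) {
            currentSize -= previous; // remove stale size from the estimate
            return previous;
        }
        return 0L;
    }

    // After updating the entry, add its whole new size back exactly once.
    void writeRow(String key, long newSerializedSize) {
        buffer.put(key, newSerializedSize);
        currentSize += newSerializedSize;
    }
}
```

The point of subtracting before re-adding is that the estimate stays correct no matter how many times the same row is revisited before a flush.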

> SSTableSimpleUnsortedWriter take long time when inserting big rows
> ------------------------------------------------------------------
>                 Key: CASSANDRA-3122
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8.3
>            Reporter: Benoit Perroud
>            Priority: Minor
>             Fix For: 0.8.5
>         Attachments: 3122.patch, SSTableSimpleUnsortedWriter-v2.patch, SSTableSimpleUnsortedWriter.patch
> In SSTableSimpleUnsortedWriter, when dealing with rows that have a lot of columns, if we
call newRow() several times (to flush data as soon as possible), the time taken by each newRow()
call increases non-linearly. This is because each time newRow() is called, we merge the
ever-growing existing CF with the new one.
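A minimal illustration of why that merge pattern is quadratic (not Cassandra code; the counter just tallies element copies, assuming a merge re-copies every column accumulated so far plus the new batch):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of repeated merging into a growing buffer: each "newRow" batch
// is merged into everything buffered so far, so total copy work grows as
// roughly batches^2 / 2 rather than linearly in the number of columns.
class MergeCostSketch {
    static long copiesWhenMerging(int batches, int columnsPerBatch) {
        List<Integer> merged = new ArrayList<>();
        long copies = 0;
        for (int b = 0; b < batches; b++) {
            List<Integer> fresh = new ArrayList<>();
            for (int c = 0; c < columnsPerBatch; c++)
                fresh.add(c);
            // a merge touches every existing column plus the new ones
            copies += merged.size() + fresh.size();
            merged.addAll(fresh);
        }
        return copies;
    }
}
```

With one column per batch, n batches cost 1 + 2 + ... + n = n(n+1)/2 copies, which is the non-linear growth the report describes.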

This message is automatically generated by JIRA.