cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-3122) SSTableSimpleUnsortedWriter take long time when inserting big rows
Date Fri, 02 Sep 2011 15:08:09 GMT


Sylvain Lebresne updated CASSANDRA-3122:

    Attachment: 3122.patch

As said on the mailing list, although that solution does improve performance, I think we can
do better by simply having the insertions go into the previous column family when we "reopen
a row", instead of creating a new column family each time and copying everything into the
previous one afterwards.

Attaching a patch (3122.patch) that does just this. Note that this patch also fixes a bug whereby
the last row wasn't written, and adds a unit test for the UnsortedWriter.
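
To make the difference between the two strategies concrete, here is a minimal sketch in Java. It is a toy model, not the code from 3122.patch: RowBufferSketch, its methods and the "big-row" key are hypothetical names, plain TreeMaps stand in for column families, and the direction of the merge copy (older columns copied into the fresh map) is an assumption. It only counts how many columns get copied when one row is reopened many times.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of an unsorted writer's in-memory buffer; not Cassandra's actual classes. */
public class RowBufferSketch {
    // Buffered rows: row key -> columns sorted by name (name -> value).
    private final Map<String, TreeMap<String, String>> buffer = new TreeMap<>();
    private final boolean reuse;      // true: reopen the buffered row; false: fresh map + merge
    private String currentKey;
    private TreeMap<String, String> current;
    private long copiedColumns = 0;   // columns copied while merging (the cost being compared)

    public RowBufferSketch(boolean reuse) {
        this.reuse = reuse;
    }

    /** Starts (or reopens) the row with the given key. */
    public void newRow(String key) {
        storeCurrent();
        currentKey = key;
        current = reuse
                ? buffer.computeIfAbsent(key, k -> new TreeMap<>()) // write into the existing row
                : new TreeMap<>();                                  // merged later in storeCurrent()
    }

    public void addColumn(String name, String value) {
        current.put(name, value);
    }

    /** Stores the row in progress, so the last row is not lost. */
    public void close() {
        storeCurrent();
        current = null;
    }

    private void storeCurrent() {
        if (current == null || reuse) {
            return; // nothing open, or the open row already lives in the buffer
        }
        TreeMap<String, String> previous = buffer.put(currentKey, current);
        if (previous != null) {
            // Assumed merge direction: copy every older column into the new map,
            // keeping the newer value on a name clash. Cost grows with the row size.
            previous.forEach(current::putIfAbsent);
            copiedColumns += previous.size();
        }
    }

    public long copiedColumns() {
        return copiedColumns;
    }

    public int columnCount(String key) {
        TreeMap<String, String> row = buffer.get(key);
        return row == null ? 0 : row.size();
    }

    /** Reopens one row `batches` times, adding `perBatch` new columns each time. */
    public static RowBufferSketch fill(boolean reuse, int batches, int perBatch) {
        RowBufferSketch writer = new RowBufferSketch(reuse);
        for (int b = 0; b < batches; b++) {
            writer.newRow("big-row");
            for (int i = 0; i < perBatch; i++) {
                writer.addColumn("c" + (b * perBatch + i), "v");
            }
        }
        writer.close();
        return writer;
    }

    public static void main(String[] args) {
        for (boolean reuse : new boolean[] {false, true}) {
            RowBufferSketch writer = fill(reuse, 100, 10);
            System.out.println((reuse ? "reuse" : "merge") + ": columns="
                    + writer.columnCount("big-row") + ", copied=" + writer.copiedColumns());
        }
    }
}
```

In this model, reopening one row 100 times with 10 new columns each time ends with 1000 columns under both strategies, but the merge variant copies 49500 columns along the way while the reuse variant copies none. The close() call stands in for the step that stores the final row; in the merge variant, skipping it would leave the last batch out of the buffer.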

> SSTableSimpleUnsortedWriter take long time when inserting big rows
> ------------------------------------------------------------------
>                 Key: CASSANDRA-3122
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8.3
>            Reporter: Benoit Perroud
>            Priority: Minor
>             Fix For: 0.8.5
>         Attachments: 3122.patch, SSTableSimpleUnsortedWriter-v2.patch, SSTableSimpleUnsortedWriter.patch
> In SSTableSimpleUnsortedWriter, when dealing with rows that have a lot of columns, if we
> call newRow() several times (to flush data as soon as possible), the time taken by each
> newRow() call increases non-linearly. This is because when newRow() is called, we merge the
> ever-growing existing CF with the new one.


