incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Performance issues with CQL3 collections?
Date Fri, 28 Jun 2013 05:04:26 GMT
Can you provide details of the mutation statements you are running? The Stack Overflow posts
don't seem to include them.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 27/06/2013, at 5:58 AM, Theo Hultberg <theo@iconara.net> wrote:

> Do I understand correctly that collection modifications are done by reading the collection, writing a range tombstone that covers the collection, and then rewriting the whole collection? Or is it only the modified parts of the collection that are covered by range tombstones, but you still get massive amounts of them, and it's just their number that is the problem?
> 
> Would this explain the slowdown of writes too? I guess it would if Cassandra needed to read the collection before it wrote the new values; otherwise I don't understand how this affects writes, but that only says how much I know about how this works.
> 
> T#
> 
> 
> On Wed, Jun 26, 2013 at 10:48 AM, Fabien Rousseau <fabien@yakaz.com> wrote:
> Hi,
> 
> I'm pretty sure that it's related to this ticket: https://issues.apache.org/jira/browse/CASSANDRA-5677
> 
> I'd be happy if someone tests this patch.
> It should apply easily on 1.2.5 & 1.2.6
> 
> After applying the patch, the current implementation is still used by default; to switch to the new one, add the following line to your cassandra.yaml:
> interval_tree_provider: IntervalTreeAvlProvider
> 
> (Note that the implementations should be interchangeable, because they share the same serializers and deserializers.)
> 
> Also, please note that this patch has been neither reviewed nor intensively tested, so it may not be "production ready".
> 
> Fabien
> 
> 2013/6/26 Theo Hultberg <theo@iconara.net>
> Hi,
> 
> I've seen a couple of people on Stack Overflow having performance problems with maps that they continuously update, and in hindsight I think I might have run into the same problem myself (but I didn't suspect it as the reason, designed differently, and by accident didn't use maps anymore).
> 
> Is there any reason that maps (or lists or sets) in particular would become a performance issue when they're heavily modified? As I've understood them, they're not special and shouldn't be any different performance-wise from overwriting regular columns. Is there something different going on that I'm missing?
> 
> Here are the Stack Overflow questions:
> 
> http://stackoverflow.com/questions/17282837/cassandra-insert-perfomance-issue-into-a-table-with-a-map-type/17290981
> 
> http://stackoverflow.com/questions/17082963/bad-performance-when-writing-log-data-to-cassandra-with-timeuuid-as-a-column-nam/17123236
> 
> yours,
> Theo
> 
> 
> 
> -- 
> Fabien Rousseau
> 
> 
> www.yakaz.com
> 

