incubator-cassandra-user mailing list archives

From David Hawthorne <dha...@gmx.3crowd.com>
Subject Re: ReplicateOnWrite issues
Date Tue, 12 Jul 2011 23:00:47 GMT
Thanks for looking at that.

Our use case involves supercolumns that each hold anywhere from 2 to 20,000 counters.  For a
series of continuous updates to one supercolumn, the behavior you're describing is:

insert first counter into supercolumn
insert second counter into supercolumn
read entire supercolumn (now 2 wide)
insert third counter into supercolumn
read entire supercolumn (now 3 wide)
insert fourth counter into supercolumn
read entire supercolumn (now 4 wide)
...
insert 20,000th counter into supercolumn
read entire supercolumn for the 20,000th time (now 20,000 columns wide)
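To put a rough number on that pattern, here is a minimal sketch of the arithmetic. It is not Cassandra code; it only models the sequence listed above, and it assumes a full read of the supercolumn follows every single insert (the function name is made up for illustration).

```python
def total_columns_read(n_counters: int) -> int:
    """Columns read if the whole supercolumn is re-read after every insert.

    Assumption: one full read per insert, as in the sequence above.
    """
    total = 0
    width = 0
    for _ in range(n_counters):
        width += 1      # insert one more counter into the supercolumn
        total += width  # read the entire supercolumn at its current width
    return total


if __name__ == "__main__":
    for n in (4, 2_000, 20_000):
        print(n, total_columns_read(n))
```

Under that assumption the total is n(n+1)/2, so filling one supercolumn to 20,000 counters means reading about 200 million columns, i.e. the read cost grows quadratically with supercolumn width.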

What happens if I turn replicate on write off and go to RF=3 on a multi-node cluster?

Unfortunately, I don't see a way to get the size of the largest supercolumn in the same way
you can get the size of the largest row, so I don't know what our max number of columns in
any supercolumn is.

The test I was running against the single-node cluster just died; here's a graph.  It held
steady at 2.5-3k inserts/sec for a while, and then Cassandra became unresponsive to JMX requests
for a while (that's the sharp dip to 0 at 15:48), after which you can see the ReplicateOnWrite
Pending Tasks creep upwards of 1M when the max row size and max size of any CF on disk both
spike.  You can also see the total number of reads done increase sharply at the same time.
All stats are absolute values with the exception of the inserts/sec, which was multiplied
by 10 so it would show up with everything else.  Inserts/sec are from the client's perspective,
not from Cassandra's.  I can also tell you that the client is seeing a lot of Hector timeout
exceptions, and the retry burden has been pushed back to the client.

