incubator-cassandra-user mailing list archives

From David Hawthorne <dha...@gmx.3crowd.com>
Subject Re: Replicate On Write behavior
Date Fri, 02 Sep 2011 07:45:54 GMT
That's interesting.  I ran an experiment in which I added some entropy to the row name, based
on the time the increment came in (e.g. row = row + "/" + (timestamp - (timestamp % 300))).
Now the load (in GB) on my cluster is more balanced, and performance (inserts/sec) has not
decayed: it has stayed steady, with a relatively low average ms/insert.  Each row is also
significantly shorter as a result of this change.
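
For illustration, the row-name change can be sketched roughly as below. This is only a
sketch in Python: the function name, the parameter names, and the assumption that the
timestamp is an integer number of seconds are mine for the example, not taken from the
actual client code.

```python
def bucketed_row_key(row: str, timestamp: int, bucket_seconds: int = 300) -> str:
    """Append the start of the time bucket to the row name.

    Hypothetical helper: `timestamp` is assumed to be an integer in seconds,
    and `bucket_seconds` defaults to the 300 used in the expression above.
    """
    # Round the timestamp down to the start of its bucket.
    bucket_start = timestamp - (timestamp % bucket_seconds)
    return f"{row}/{bucket_start}"


if __name__ == "__main__":
    # Increments arriving in the same 300-second window share a row key.
    print(bucketed_row_key("counters", 1000))  # counters/900
    print(bucketed_row_key("counters", 1199))  # counters/900
    # The next window starts a new, separate row.
    print(bucketed_row_key("counters", 1200))  # counters/1200
```

With a scheme like this, each logical row is split into one physical row per time window,
so a reader has to compute the same bucket (or iterate over buckets) to find the data.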



On Sep 2, 2011, at 12:30 AM, Sylvain Lebresne wrote:

> On Thu, Sep 1, 2011 at 8:52 PM, David Hawthorne <dhawth@gmx.3crowd.com> wrote:
>> I'm curious... digging through the source, it looks like replicate on write triggers
>> a read of the entire row, and not just the columns/supercolumns that are affected by the
>> counter update.  Is this the case?  It would certainly explain why my inserts/sec decay
>> over time and why the average insert latency increases over time.  The strange thing is
>> that I'm not seeing disk read IO increase over that same period, but that might be due to
>> the OS buffer cache...
> 
> It does not. It only reads the columns/supercolumns affected by the
> counter update.
> In the source, this happens in CounterMutation.java. If you look at
> addReadCommandFromColumnFamily you'll see that it does a query by name
> only for the column involved in the update (the update is basically
> the content of the columnFamily parameter there).
> 
> And Cassandra does *not* always read a full row. Never has, never will.
> 
>> On another note, on a 5-node cluster, I'm only seeing 3 nodes with ReplicateOnWrite
>> Completed tasks in nodetool tpstats output.  Is that normal?  I'm using RandomPartitioner...
>> 
>> Address         DC          Rack        Status State   Load            Owns    Token
>>                                                                            136112946768375385385349842972707284580
>> 10.0.0.57    datacenter1 rack1       Up     Normal  2.26 GB         20.00%  0
>> 10.0.0.56    datacenter1 rack1       Up     Normal  2.47 GB         20.00%  34028236692093846346337460743176821145
>> 10.0.0.55    datacenter1 rack1       Up     Normal  2.52 GB         20.00%  68056473384187692692674921486353642290
>> 10.0.0.54    datacenter1 rack1       Up     Normal  950.97 MB       20.00%  102084710076281539039012382229530463435
>> 10.0.0.72    datacenter1 rack1       Up     Normal  383.25 MB       20.00%  136112946768375385385349842972707284580
>> 
>> The nodes with ReplicateOnWrites are the 3 in the middle.  The first node and last
>> node both have a count of 0.  This is a clean cluster, and I've been doing 3k ... 2.5k
>> (decaying performance) inserts/sec for the last 12 hours.  The last time this test ran,
>> it went all the way down to 500 inserts/sec before I killed it.
> 
> Could be due to https://issues.apache.org/jira//browse/CASSANDRA-2890.
> 
> --
> Sylvain

