incubator-cassandra-user mailing list archives

From Yang <teddyyyy...@gmail.com>
Subject Re: Replicate On Write behavior
Date Thu, 01 Sep 2011 21:05:36 GMT
Sorry, I meant cf * row.

If you look in the code, db.cf is basically just a set of columns.
On Sep 1, 2011 1:36 PM, "Ian Danforth" <idanforth@numenta.com> wrote:
> I'm not sure I understand the scalability of this approach. A given
> column family can be HUGE with millions of rows and columns. In my
> cluster I have a single column family that accounts for 90GB of load
> on each node. Not only that but column family is distributed over the
> entire ring.
>
> Clearly I'm misunderstanding something.
>
> Ian
>
> On Thu, Sep 1, 2011 at 1:17 PM, Yang <teddyyyy123@gmail.com> wrote:
>> When Cassandra reads, the entire CF is always read together; the pruning
>> happens only at the hand-over to the client.
>>
>> On Thu, Sep 1, 2011 at 11:52 AM, David Hawthorne <dhawth@gmx.3crowd.com>
>> wrote:
>>>
>>> I'm curious... digging through the source, it looks like replicate on
>>> write triggers a read of the entire row, and not just the
>>> columns/supercolumns that are affected by the counter update.  Is this the
>>> case?  It would certainly explain why my inserts/sec decay over time and why
>>> the average insert latency increases over time.  The strange thing is that
>>> I'm not seeing disk read IO increase over that same period, but that might
>>> be due to the OS buffer cache...
>>>
>>> On another note, on a 5-node cluster, I'm only seeing 3 nodes with
>>> ReplicateOnWrite Completed tasks in nodetool tpstats output.  Is that
>>> normal?  I'm using RandomPartitioner...
>>>
>>> Address    DC           Rack   Status  State   Load       Owns    Token
>>>                                                                   136112946768375385385349842972707284580
>>> 10.0.0.57  datacenter1  rack1  Up      Normal  2.26 GB    20.00%  0
>>> 10.0.0.56  datacenter1  rack1  Up      Normal  2.47 GB    20.00%  34028236692093846346337460743176821145
>>> 10.0.0.55  datacenter1  rack1  Up      Normal  2.52 GB    20.00%  68056473384187692692674921486353642290
>>> 10.0.0.54  datacenter1  rack1  Up      Normal  950.97 MB  20.00%  102084710076281539039012382229530463435
>>> 10.0.0.72  datacenter1  rack1  Up      Normal  383.25 MB  20.00%  136112946768375385385349842972707284580
>>>
>>> The nodes with ReplicateOnWrites are the 3 in the middle.  The first node
>>> and last node both have a count of 0.  This is a clean cluster, and I've
>>> been doing 3k ... 2.5k (decaying performance) inserts/sec for the last 12
>>> hours.  The last time this test ran, it went all the way down to 500
>>> inserts/sec before I killed it.
>>
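
For reference, the tokens in the ring output quoted above are evenly spaced
across the RandomPartitioner token range (0 to 2**127), which is why each node
reports 20.00% ownership. A minimal sketch of that arithmetic follows. The
function name `balanced_tokens` is hypothetical and is not part of Cassandra
or nodetool; the spacing rule (a fixed integer step of 2**127 // node_count)
is inferred from the numbers in the output, not taken from the source code.

```python
# Sketch only: evenly spaced initial tokens for RandomPartitioner.
# RING_SIZE is the size of the RandomPartitioner token space (2**127).
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return node_count tokens spaced by a fixed integer step.

    Hypothetical helper for illustration; the step is floored once and
    then multiplied, which matches the tokens shown in the ring output.
    """
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    step = RING_SIZE // node_count
    return [i * step for i in range(node_count)]


if __name__ == "__main__":
    # Print one token per node for a 5-node ring.
    for token in balanced_tokens(5):
        print(token)
```

With `node_count` set to 5, the values match the five tokens listed in the
ring output, so the uneven Load column there is not explained by token
placement alone.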
