incubator-cassandra-user mailing list archives

From Yiming Sun <yiming....@gmail.com>
Subject Re: strange row cache behavior
Date Tue, 04 Dec 2012 20:34:27 GMT
Got it.  Thanks again, Aaron.

-- Y.


On Tue, Dec 4, 2012 at 3:07 PM, aaron morton <aaron@thelastpickle.com> wrote:

>  Does this mean we should not enable row caches until we are absolutely
> sure about what's hot (I think there is a reason why row caches are
> disabled by default) ?
>
> Yes and Yes.
> Row cache takes memory and CPU; unless you know you are getting a benefit
> from it, leave it off. The key cache and OS disk cache will help. If you
> find latency is an issue, then start poking around.
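Aaron's sizing point can be made concrete with a back-of-the-envelope check (a sketch; the 20 MB average row size is an assumed figure based on the "10's of MBs" rows mentioned later in this thread):

```java
public class RowCacheSizing {
    // How many whole rows a row cache of cacheBytes can hold on average.
    public static long rowsThatFit(long cacheBytes, long avgRowBytes) {
        return cacheBytes / avgRowBytes;
    }

    public static void main(String[] args) {
        long cacheBytes = 1L << 30;            // the 1 GB row cache per node from this thread
        long avgRowBytes = 20L * 1024 * 1024;  // assumed ~20 MB average row
        // Only ~51 rows fit; unless the hot set is that small, the
        // cache will churn and the hit rate will stay near zero.
        System.out.println(rowsThatFit(cacheBytes, avgRowBytes)); // 51
    }
}
```

Against the roughly 3 million rows mentioned below, 51 cacheable rows per node makes it clear why the row cache only pays off for a small, well-identified hot set.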
>
> Cheers
>
>    -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 5/12/2012, at 4:23 AM, Yiming Sun <yiming.sun@gmail.com> wrote:
>
> Hi Aaron,
>
> Thank you, and your explanation makes sense.  At the time, I thought having
> 1GB of row cache on each node was plenty, because there was an aggregate 6GB
> of cache, but you are right: with each row in the 10's of MBs, some of the
> nodes can go into a constant load-and-evict cycle, which would hurt
> performance.  I will try, as you suggested, to 1) reduce the requested entry
> set, and 2) increase the row cache size and see if they get better hits, and
> also do 3) reverse the requested entry list in alternate runs.
>
> Our data space has close to 3 million rows, but we haven't gathered enough
> usage statistics to know which rows are hot.  Does this mean we should not
> enable row caches until we are absolutely sure about what's hot (I think
> there is a reason why row caches are disabled by default)?  It also seems
> from my test that the OS page cache works much better, but that may be
> because the OS page cache can use all the available memory, so it is
> effectively larger -- I guess I will find out by doing 2) above.
>
> best,
>
> -- Y.
>
>
>
> On Tue, Dec 4, 2012 at 4:47 AM, aaron morton <aaron@thelastpickle.com> wrote:
>
>> > Row Cache        : size 1072651974 (bytes), capacity 1073741824
>> (bytes), 0 hits, 2576 requests, NaN recent hit rate, 0 save period in
>> seconds
>>
>> So the cache is pretty much full, there is only 1 MB free.
>>
>> There were 2,576 read requests that tried to get a row from the cache.
>> Zero of those had a hit. If you have 6 nodes and RF 2, each node has one
>> third of the data in the cluster (from the effective ownership info). So
>> depending on the read workload the number of read requests on each node may
>> be different.
>>
>> What I think is happening is reads are populating the row cache, then
>> subsequent reads are evicting items from the row cache before you get back
>> to reading the original rows. So if you read rows 1 to 5, they are put in
>> the cache, when you read rows 6 to 10 they are put in and evict rows 1 to
>> 5. Then you read rows 1 to 5 again they are not in the cache.
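The cycle Aaron describes can be reproduced with a toy access-order LRU cache (a minimal sketch of the eviction idea, not Cassandra's actual row cache implementation):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy LRU cache: cycling through more rows than the cache can hold
// evicts each row just before it is needed again, so hits stay at zero.
public class LruDemo {
    static class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;
        int hits = 0, requests = 0;

        LruCache(int capacity) {
            super(16, 0.75f, true); // true = access order, i.e. LRU
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }

        void read(K key, V valueOnMiss) {
            requests++;
            if (get(key) != null) { hits++; } else { put(key, valueOnMiss); }
        }
    }

    public static void main(String[] args) {
        LruCache<Integer, String> cache = new LruCache<>(5);
        // Two passes over 10 "rows" through a cache that holds only 5:
        // pass one leaves rows 5-9 cached, but pass two asks for row 0
        // first, evicting row 5 just before it is needed, and so on.
        for (int pass = 0; pass < 2; pass++) {
            for (int row = 0; row < 10; row++) cache.read(row, "data");
        }
        System.out.println(cache.hits + " hits / " + cache.requests + " requests"); // 0 / 20
    }
}
```

Reversing the key order on the second pass would turn the last five reads of the first pass into hits, which is one reason the request order changes which nodes show hits at all.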
>>
>> Try testing with a lower number of hot rows, and/or a bigger row cache.
>>
>> But to be honest, with rows in the 10's of MB you will probably only get
>> good cache performance with a small set of hot rows.
>>
>> Hope that helps.
>>
>>
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 1/12/2012, at 5:11 AM, Yiming Sun <yiming.sun@gmail.com> wrote:
>>
>> > Does anyone have any comments/suggestions for me regarding this?  Thanks
>> >
>> >
>> > I am trying to understand some strange behavior of the Cassandra row
>> cache.  We have a 6-node Cassandra cluster in a single data center on 2
>> racks, and neighboring nodes on the ring are on alternating racks.  Each
>> node has a 1GB row cache, with the key cache disabled.  The cluster uses
>> PropertyFileSnitch, and the ColumnFamily I fetch from uses
>> NetworkTopologyStrategy with a replication factor of 2.  My client code
>> uses Hector to fetch a fixed set of rows from Cassandra.
>> >
>> > What I don't quite understand is that even after I ran the client code
>> several times, there are always some nodes with 0 row cache hits, even
>> though the row caches on all nodes are filled and all nodes receive
>> requests.
>> >
>> > Which nodes have 0 hits seems to be strongly related to the following:
>> >
>> >  - the set of row keys to fetch
>> >  - the order of the set of row keys to fetch
>> >  - the list of hosts passed to Hector's CassandraHostConfigurator
>> >  - the order of the list of hosts passed to Hector
>> >
>> > Can someone shed some light on how exactly the row cache works, and
>> hopefully also explain the behavior I have been seeing?  I thought if the
>> fixed set of row keys is the only thing I am fetching (each row should be
>> on the order of 10's of MBs, no more than 100MB), and each node gets
>> requests, and its row cache is filled, there has got to be some hits.
>> Apparently this is not the case.  Thanks.
>> >
>> > cluster information:
>> >
>> > Address         DC          Rack        Status State   Load
>>  Effective-Ownership Token
>> >
>>                    141784319550391026443072753096570088105
>> > x.x.x.1    DC1         r1          Up     Normal  587.46 GB
>> 33.33%              0
>> > x.x.x.2    DC1         r2          Up     Normal  591.21 GB
>> 33.33%              28356863910078205288614550619314017621
>> > x.x.x.3    DC1         r1          Up     Normal  594.97 GB
>> 33.33%              56713727820156410577229101238628035242
>> > x.x.x.4    DC1         r2          Up     Normal  587.15 GB
>> 33.33%              85070591730234615865843651857942052863
>> > x.x.x.5    DC1         r1          Up     Normal  590.26 GB
>> 33.33%              113427455640312821154458202477256070484
>> > x.x.x.6    DC1         r2          Up     Normal  583.21 GB
>> 33.33%              141784319550391026443072753096570088105
>> >
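The evenly spaced tokens in the ring output above can be reproduced with a short calculation (a sketch, assuming the common balanced-ring recipe for RandomPartitioner's 0..2^127 token space: node i gets i * (2^127 / N)):

```java
import java.math.BigInteger;

// Reproduce the balanced 6-node token assignment shown in the
// nodetool ring output above.
public class TokenCalc {
    public static BigInteger token(int nodeIndex, int nodeCount) {
        BigInteger ringSize = BigInteger.ONE.shiftLeft(127); // 2^127
        return ringSize.divide(BigInteger.valueOf(nodeCount))
                       .multiply(BigInteger.valueOf(nodeIndex));
    }

    public static void main(String[] args) {
        for (int i = 0; i < 6; i++) {
            System.out.println(token(i, 6));
        }
        // index 3 -> 85070591730234615865843651857942052863  (x.x.x.4)
        // index 5 -> 141784319550391026443072753096570088105 (x.x.x.6)
    }
}
```

With tokens this evenly spaced and RF 2, every node owns an equal share of the ring, matching the 33.33% effective ownership shown for each node.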
>> >
>> > [user@node]$ ./checkinfo.sh
>> > *************** x.x.x.4
>> > Token            : 85070591730234615865843651857942052863
>> > Gossip active    : true
>> > Thrift active    : true
>> > Load             : 587.15 GB
>> > Generation No    : 1354074048
>> > Uptime (seconds) : 36957
>> > Heap Memory (MB) : 2027.29 / 3948.00
>> > Data Center      : DC1
>> > Rack             : r2
>> > Exceptions       : 0
>> >
>> > Key Cache        : size 0 (bytes), capacity 0 (bytes), 0 hits, 0
>> requests, NaN recent hit rate, 14400 save period in seconds
>> > Row Cache        : size 1072651974 (bytes), capacity 1073741824
>> (bytes), 0 hits, 2576 requests, NaN recent hit rate, 0 save period in
>> seconds
>> >
>> > *************** x.x.x.6
>> > Token            : 141784319550391026443072753096570088105
>> > Gossip active    : true
>> > Thrift active    : true
>> > Load             : 583.21 GB
>> > Generation No    : 1354074461
>> > Uptime (seconds) : 36535
>> > Heap Memory (MB) : 828.71 / 3948.00
>> > Data Center      : DC1
>> > Rack             : r2
>> > Exceptions       : 0
>> >
>> > Key Cache        : size 0 (bytes), capacity 0 (bytes), 0 hits, 0
>> requests, NaN recent hit rate, 14400 save period in seconds
>> > Row Cache        : size 1072602906 (bytes), capacity 1073741824
>> (bytes), 0 hits, 3194 requests, NaN recent hit rate, 0 save period in
>> seconds
>> >
>> >
>>
>>
>
>
