cassandra-user mailing list archives

From Avinash Lakshman <avinash.laksh...@gmail.com>
Subject Re: Reading thousands of columns
Date Wed, 14 Apr 2010 19:40:11 GMT
How large are the values? How much data on disk?

On Wednesday, April 14, 2010, James Golick <jamesgolick@gmail.com> wrote:
> Just for the record, I am able to repeat this locally.
> I'm seeing around 150ms to read 1000 columns from a row that has 3000 in it. If I enable
> the rowcache, that goes down to about 90ms. According to my profile, 90% of the time is being
> spent waiting for cassandra to respond, so it's not thrift.
>
> On Wed, Apr 14, 2010 at 11:01 AM, Paul Prescod <prescod@gmail.com> wrote:
>
> On Wed, Apr 14, 2010 at 10:31 AM, Mike Malone <mike@simplegeo.com> wrote:
>> ...
>>
>> Couldn't you cache a list of keys that were returned for the key range, then
>> cache individual rows separately or not at all?
>> By "blowing away rows queried by key" I'm guessing you mean "pushing them
>> out of the LRU cache," not explicitly blowing them away? Either way I'm not
>> entirely convinced. In my experience I've had pretty good success caching
>> items that were pulled out via more complicated join / range type queries.
>> If your system is doing lots of range queries, and not a lot of lookups by
>> key, you'd obviously see a performance win from caching the range queries.
>> Maybe range scan caching could be turned on separately?
>
> I agree with you that the caches should be separate, if you're going
> to cache ranges. You could imagine a single query (perhaps entered
> interactively) would replace the entire row cache, evicting all of the data
> for the system's interactive users. For example, a summary page of who
> is most active over the last month could replace the profile
> information for the actual users who are using the system at that
> moment.
>
>  Paul Prescod
>
>
>
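
The eviction concern in the quoted exchange can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Cassandra's row cache is implemented: `LRUCache`, the capacity of 3, and the user and range keys are all hypothetical. It compares one shared LRU cache with separate caches for rows read by key and for range results.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical fixed-capacity LRU cache (not Cassandra's row cache)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used

    def __contains__(self, key):
        return key in self._items


USERS = ("alice", "bob", "carol")  # rows read by key (made-up data)
RANGE_RESULTS = [("range", i) for i in range(3)]  # rows from one big scan

# Case 1: a single shared cache. The range scan evicts every user row.
shared = LRUCache(capacity=3)
for user in USERS:
    shared.put(user, {"profile": user})
for key in RANGE_RESULTS:
    shared.put(key, {"summary": key[1]})
shared_keeps_alice = "alice" in shared

# Case 2: separate caches. The range scan only competes with other scans.
row_cache = LRUCache(capacity=3)
range_cache = LRUCache(capacity=3)
for user in USERS:
    row_cache.put(user, {"profile": user})
for key in RANGE_RESULTS:
    range_cache.put(key, {"summary": key[1]})
separate_keeps_alice = "alice" in row_cache

print(shared_keeps_alice, separate_keeps_alice)
```

With the shared cache, the three range entries push all three user rows out; with separate caches the user rows survive the scan, which is the behaviour both posters are arguing for.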
