incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@yakaz.com>
Subject Re: Bad read performances: 'few rows of many columns' vs 'many rows of few columns'
Date Wed, 10 Mar 2010 12:13:54 GMT
Well, I've found the reason.
The default Cassandra configuration uses a 10% row cache,
and the row cache reads the whole row each time. So it was indeed reading the
full row on every request, even though the request asked for only one column.
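
To illustrate the effect described above, here is a minimal toy model in Python. It is not Cassandra code: the `ToyStore` and `ToyRowCache` classes and the two helper functions are hypothetical names made up for this sketch, and the only behavior it assumes is the one stated in this message, namely that a row cache loads the entire row even when a single column is requested.

```python
from typing import Dict, List

# A row maps column names to values (hypothetical in-memory stand-in for disk data).
Row = Dict[str, bytes]


class ToyStore:
    """Toy backing store that counts how many columns it has to read."""

    def __init__(self, rows: Dict[str, Row]) -> None:
        self.rows = rows
        self.columns_read = 0

    def read_columns(self, key: str, names: List[str]) -> Row:
        # Read only the requested columns of one row.
        row = self.rows[key]
        self.columns_read += len(names)
        return {name: row[name] for name in names}

    def read_full_row(self, key: str) -> Row:
        # Read every column of the row, whatever the caller wanted.
        row = self.rows[key]
        self.columns_read += len(row)
        return dict(row)


class ToyRowCache:
    """Toy row cache: on a miss it loads the whole row (assumed behavior)."""

    def __init__(self, store: ToyStore) -> None:
        self.store = store
        self.cache: Dict[str, Row] = {}

    def get_column(self, key: str, name: str) -> bytes:
        if key not in self.cache:
            self.cache[key] = self.store.read_full_row(key)
        return self.cache[key][name]


def make_rows(n_rows: int, n_cols: int) -> Dict[str, Row]:
    return {
        f"row{r}": {f"col{c}": b"x" for c in range(n_cols)}
        for r in range(n_rows)
    }


def cold_reads_with_row_cache(n_rows: int, n_cols: int) -> int:
    """Columns read when fetching the first column of each row via the cache."""
    store = ToyStore(make_rows(n_rows, n_cols))
    cache = ToyRowCache(store)
    for r in range(n_rows):
        cache.get_column(f"row{r}", "col0")
    return store.columns_read


def cold_reads_without_cache(n_rows: int, n_cols: int) -> int:
    """Columns read when fetching the first column of each row directly."""
    store = ToyStore(make_rows(n_rows, n_cols))
    for r in range(n_rows):
        store.read_columns(f"row{r}", ["col0"])
    return store.columns_read


if __name__ == "__main__":
    print(cold_reads_with_row_cache(10, 100))  # 1000
    print(cold_reads_without_cache(10, 100))   # 10
```

In this toy model, asking for one column from each of 10 uncached rows of 100 columns touches 1000 columns with the row cache and 10 without it. That is only a sketch of why wide rows can look slow when every request misses the cache; it says nothing about real seek or deserialization costs.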

My bad (at least I learned something).

--
Sylvain

On Tue, Mar 9, 2010 at 9:49 PM, Brandon Williams <driftx@gmail.com> wrote:
> On Tue, Mar 9, 2010 at 2:28 PM, Sylvain Lebresne <sylvain@yakaz.com> wrote:
>>
>> > A row causes a disk seek while columns are contiguous.  So if the row isn't
>> > in the cache, you're being impaired by the seeks.  In general, fatter rows
>> > should be more performant than skinny ones.
>>
>> Sure, I understand that. Still, I get 400 columns per second (i.e., 400 seeks
>> per second) when each row has only one column, while I get 10 columns per
>> second when each row has 100 columns, even though I read only the first
>> column.
>
> Doesn't that imply the disk is having to seek further for the rows with more
> columns?
> -Brandon
