incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Persistently increasing read latency
Date Fri, 04 Dec 2009 19:18:34 GMT
Fundamentally there's only so much I/O you can do at a time.  If you
don't have enough, you need to upgrade to servers with better I/O
(i.e. not EC2: http://pl.atyp.us/wordpress/?p=2240&cpage=1) and/or
more RAM to cache the reads against.

On Fri, Dec 4, 2009 at 1:07 PM, B. Todd Burruss <bburruss@real.com> wrote:
> this is very concerning to me.  it doesn't seem to take much to bring
> the read performance to an unacceptable level.  are there any
> suggestions about how to improve performance?
>
> here are the params from my config file that are not defaults.  i
> adjusted these to get really good performance, but not over the long haul.
> has anyone had any luck adjusting these to help the problem tim and I
> are having?
>
> <CommitLogRotationThresholdInMB>256</CommitLogRotationThresholdInMB>
> <MemtableSizeInMB>1024</MemtableSizeInMB>
> <MemtableObjectCountInMillions>0.6</MemtableObjectCountInMillions>
> <CommitLogSyncPeriodInMS>1000</CommitLogSyncPeriodInMS>
> <MemtableFlushAfterMinutes>1440</MemtableFlushAfterMinutes>
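[Editor's note: for readers unfamiliar with these settings, the three memtable parameters above race against each other: a memtable is flushed to disk as soon as its data size, its object count, or its age first crosses the configured threshold, whichever comes first.  A minimal sketch of that trigger logic (a hypothetical helper for illustration, not Cassandra's actual code), using the thresholds from the config above:]

```python
def should_flush(size_mb, objects_millions, age_minutes,
                 max_size_mb=1024.0,           # MemtableSizeInMB
                 max_objects_millions=0.6,     # MemtableObjectCountInMillions
                 max_age_minutes=1440.0):      # MemtableFlushAfterMinutes
    """A memtable is flushed as soon as ANY one threshold is crossed."""
    return (size_mb >= max_size_mb
            or objects_millions >= max_objects_millions
            or age_minutes >= max_age_minutes)

# With these settings, a workload of many small columns hits the
# 0.6M-object limit long before the 1024 MB size limit or the
# 24-hour timer, so object count governs the flush (and compaction) rate.
```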
>
>
> thx!
>
> On Fri, 2009-12-04 at 18:49 +0000, Freeman, Tim wrote:
>> The speed of compaction isn't the problem.  The problem is that lots of reads and writes cause compaction to fall behind.
>>
>> You could solve the problem by throttling reads and writes so compaction isn't starved.  (Maybe just the writes.  I'm not sure.)
>>
>> Different nodes will have different compaction backlogs, so you'd want to do this on a per-node basis, after Cassandra has made decisions about whatever replication it's going to do.  For example, Cassandra could observe the number of pending compaction tasks and sleep that many milliseconds before every read and write.
>>
>> The status quo is that I have to count a load test as passing only if the amount of backlogged compaction work stays less than some bound.  I'd rather not have to peer into Cassandra internals to determine whether it's really working or not.  It's a problem if 16-hour load tests get different results than 1-hour load tests, because in my tests I'm renting a cluster by the hour.
>>
>> Tim Freeman
>> Email: tim.freeman@hp.com
>> Desk in Palo Alto: (650) 857-2581
>> Home: (408) 774-1298
>> Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and Thursday; call my desk instead.)
>>
>> -----Original Message-----
>> From: Jonathan Ellis [mailto:jbellis@gmail.com]
>> Sent: Thursday, December 03, 2009 3:06 PM
>> To: cassandra-user@incubator.apache.org
>> Subject: Re: Persistently increasing read latency
>>
>> Thanks for looking into this.  Doesn't seem like there's much
>> low-hanging fruit to make compaction faster but I'll keep that in the
>> back of my mind.
>>
>> -Jonathan
>>
>> On Thu, Dec 3, 2009 at 4:58 PM, Freeman, Tim <tim.freeman@hp.com> wrote:
>> >>So this is working as designed, but the design is poor because it
>> >>causes confusion.  If you can open a ticket for this that would be
>> >>great.
>> >
>> > Done, see:
>> >
>> >   https://issues.apache.org/jira/browse/CASSANDRA-599
>> >
>> >>What does iostat -x 10 (for instance) say about the disk activity?
>> >
>> > rkB/s is consistently high, and wkB/s varies.  This is a typical entry with wkB/s at the high end of its range:
>> >
>> >>avg-cpu:  %user   %nice    %sys %iowait   %idle
>> >>           1.52    0.00    1.70   27.49   69.28
>> >>
>> >>Device:    rrqm/s   wrqm/s    r/s    w/s   rsec/s   wsec/s     rkB/s     wkB/s avgrq-sz avgqu-sz   await  svctm  %util
>> >>sda          3.10  3249.25 124.08  29.67 26299.30 26288.11  13149.65  13144.06   342.04    17.75   92.25   5.98  91.92
>> >>sda1         0.00     0.00   0.00   0.00     0.00     0.00      0.00      0.00     0.00     0.00    0.00   0.00   0.00
>> >>sda2         3.10  3249.25 124.08  29.67 26299.30 26288.11  13149.65  13144.06   342.04    17.75   92.25   5.98  91.92
>> >>sda3         0.00     0.00   0.00   0.00     0.00     0.00      0.00      0.00     0.00     0.00    0.00   0.00   0.00
>> >
>> > and at the low end:
>> >
>> >>avg-cpu:  %user   %nice    %sys %iowait   %idle
>> >>           1.50    0.00    1.77   25.80   70.93
>> >>
>> >>Device:    rrqm/s   wrqm/s    r/s    w/s   rsec/s   wsec/s     rkB/s     wkB/s avgrq-sz avgqu-sz   await  svctm  %util
>> >>sda          3.40   817.10 128.60  17.70 27828.80  6600.00  13914.40   3300.00   235.33     6.13   56.63   6.21  90.81
>> >>sda1         0.00     0.00   0.00   0.00     0.00     0.00      0.00      0.00     0.00     0.00    0.00   0.00   0.00
>> >>sda2         3.40   817.10 128.60  17.70 27828.80  6600.00  13914.40   3300.00   235.33     6.13   56.63   6.21  90.81
>> >>sda3         0.00     0.00   0.00   0.00     0.00     0.00      0.00      0.00     0.00     0.00    0.00   0.00   0.00
>> >
>> > Tim Freeman
>> > Email: tim.freeman@hp.com
>> > Desk in Palo Alto: (650) 857-2581
>> > Home: (408) 774-1298
>> > Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and Thursday; call my desk instead.)
>> >
>> >
>> > -----Original Message-----
>> > From: Jonathan Ellis [mailto:jbellis@gmail.com]
>> > Sent: Thursday, December 03, 2009 2:45 PM
>> > To: cassandra-user@incubator.apache.org
>> > Subject: Re: Persistently increasing read latency
>> >
>> > On Thu, Dec 3, 2009 at 4:34 PM, Freeman, Tim <tim.freeman@hp.com> wrote:
>> >>>Can you tell if the system is i/o or cpu bound during compaction?
>> >>
>> >> It's I/O bound.  It's using ~9% of 1 of 4 cores as I watch it, and all it's doing right now is compactions.
>> >
>> > What does iostat -x 10 (for instance) say about the disk activity?
>> >
>
>
>
