From: "Freeman, Tim" <tim.freeman@hp.com>
To: cassandra-user@incubator.apache.org
Date: Fri, 4 Dec 2009 18:49:25 +0000
Subject: RE: Persistently increasing read latency

The speed of compaction isn't the problem.  The problem is that lots of reads and writes cause compaction to fall behind.

You could solve the problem by throttling reads and writes so compaction isn't starved.  (Maybe just the writes.  I'm not sure.)

Different nodes will have different compaction backlogs, so you'd want to do this on a per-node basis after Cassandra has made decisions about whatever replication it's going to do.  For example, Cassandra could observe the number of pending compaction tasks and sleep that many milliseconds before every read and write.
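To make that concrete, here's a rough sketch of the kind of per-node throttle I mean.  The names are made up for illustration (nothing like this exists in Cassandra today); the node would feed in its own count of queued compaction tasks, and each read or write would pause one millisecond per backlogged task:

    // Sketch only: slow reads and writes down in proportion to the
    // compaction backlog.  CompactionBacklog is a stand-in for however
    // the node counts its queued compaction tasks.
    public class CompactionThrottle {

        /** Stand-in for the node's view of its own compaction queue. */
        public interface CompactionBacklog {
            int pendingCompactions();
        }

        private final CompactionBacklog backlog;

        public CompactionThrottle(CompactionBacklog backlog) {
            this.backlog = backlog;
        }

        /** Call before serving each read or write. */
        public void pause() throws InterruptedException {
            int pending = backlog.pendingCompactions();
            if (pending > 0) {
                Thread.sleep(pending);  // one millisecond per backlogged task
            }
        }
    }

A fixed 1 ms per task is just the simplest possible policy; the point is only that the delay grows with the backlog and disappears once compaction has caught up.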
The status quo is that I have to count a load test as passing only if the amount of backlogged compaction work stays less than some bound.  I'd rather not have to peer into Cassandra internals to determine whether it's really working or not.  It's a problem if 16 hour load tests get different results than 1 hour load tests, because in my tests I'm renting a cluster by the hour.

Tim Freeman
Email: tim.freeman@hp.com
Desk in Palo Alto: (650) 857-2581
Home: (408) 774-1298
Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and Thursday; call my desk instead.)

-----Original Message-----
From: Jonathan Ellis [mailto:jbellis@gmail.com]
Sent: Thursday, December 03, 2009 3:06 PM
To: cassandra-user@incubator.apache.org
Subject: Re: Persistently increasing read latency

Thanks for looking into this.  Doesn't seem like there's much
low-hanging fruit to make compaction faster, but I'll keep that in the
back of my mind.

-Jonathan

On Thu, Dec 3, 2009 at 4:58 PM, Freeman, Tim wrote:
>>So this is working as designed, but the design is poor because it
>>causes confusion.  If you can open a ticket for this that would be
>>great.
>
> Done, see:
>
>   https://issues.apache.org/jira/browse/CASSANDRA-599
>
>>What does iostat -x 10 (for instance) say about the disk activity?
>
> rkB/s is consistently high, and wkB/s varies.  This is a typical entry with wkB/s at the high end of its range:
>
>>avg-cpu:  %user   %nice    %sys %iowait   %idle
>>           1.52    0.00    1.70   27.49   69.28
>>
>>Device:    rrqm/s  wrqm/s    r/s    w/s   rsec/s   wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
>>sda          3.10 3249.25 124.08  29.67 26299.30 26288.11 13149.65 13144.06   342.04    17.75   92.25   5.98  91.92
>>sda1         0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>sda2         3.10 3249.25 124.08  29.67 26299.30 26288.11 13149.65 13144.06   342.04    17.75   92.25   5.98  91.92
>>sda3         0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>
> and at the low end:
>
>>avg-cpu:  %user   %nice    %sys %iowait   %idle
>>           1.50    0.00    1.77   25.80   70.93
>>
>>Device:    rrqm/s  wrqm/s    r/s    w/s   rsec/s   wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
>>sda          3.40  817.10 128.60  17.70 27828.80  6600.00 13914.40  3300.00   235.33     6.13   56.63   6.21  90.81
>>sda1         0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>sda2         3.40  817.10 128.60  17.70 27828.80  6600.00 13914.40  3300.00   235.33     6.13   56.63   6.21  90.81
>>sda3         0.00    0.00   0.00   0.00     0.00     0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>
> Tim Freeman
> Email: tim.freeman@hp.com
> Desk in Palo Alto: (650) 857-2581
> Home: (408) 774-1298
> Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and Thursday; call my desk instead.)
>
>
> -----Original Message-----
> From: Jonathan Ellis [mailto:jbellis@gmail.com]
> Sent: Thursday, December 03, 2009 2:45 PM
> To: cassandra-user@incubator.apache.org
> Subject: Re: Persistently increasing read latency
>
> On Thu, Dec 3, 2009 at 4:34 PM, Freeman, Tim wrote:
>>>Can you tell if the system is i/o or cpu bound during compaction?
>>
>> It's I/O bound.  It's using ~9% of 1 of 4 cores as I watch it, and all it's doing right now is compactions.
>
> What does iostat -x 10 (for instance) say about the disk activity?
>
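As for watching the "backlogged compaction work" mentioned above without peering too deep into Cassandra internals: one lightweight option is to poll the node over JMX during the load test.  The service URL, object name, and attribute name below are assumptions (they may differ by Cassandra version), so treat this as a sketch rather than a recipe:

    // Sketch: poll a node's pending compaction count over JMX.
    // The port, object name, and attribute name are assumptions; adjust
    // them to whatever your Cassandra version actually exposes.
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class CompactionBacklogWatcher {
        public static void main(String[] args) throws Exception {
            String host = args.length > 0 ? args[0] : "localhost";
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":8080/jmxrmi");
            JMXConnector jmx = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbs = jmx.getMBeanServerConnection();
                ObjectName compaction =
                        new ObjectName("org.apache.cassandra.db:type=CompactionManager");
                while (true) {
                    Number pending = (Number) mbs.getAttribute(compaction, "PendingTasks");
                    System.out.println("pending compactions: " + pending);
                    Thread.sleep(10000);  // sample every 10 seconds
                }
            } finally {
                jmx.close();
            }
        }
    }

Graphing that number over a 16-hour run shows directly whether the backlog stays bounded or keeps growing.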