From: "Freeman, Tim" <tim.freeman@hp.com>
To: cassandra-user@incubator.apache.org
Date: Thu, 3 Dec 2009 22:34:17 +0000
Subject: RE: Persistently increasing read latency

>Can you tell if the system is i/o or cpu bound during compaction?

It's I/O bound.  It's using ~9% of 1 of 4 cores as I watch it, and all
it's doing right now is compactions.

Tim Freeman
Email: tim.freeman@hp.com
Desk in Palo Alto: (650) 857-2581
Home: (408) 774-1298
Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and
Thursday; call my desk instead.)

-----Original Message-----
From: Jonathan Ellis [mailto:jbellis@gmail.com]
Sent: Thursday, December 03, 2009 2:19 PM
To: cassandra-user@incubator.apache.org
Subject: Re: Persistently increasing read latency

On Thu, Dec 3, 2009 at 3:59 PM, Freeman, Tim <tim.freeman@hp.com> wrote:
> I stopped the client at 11:28.  There were 2306 files in
> data/Keyspace1.  It's now 12:44, and there are 1826 files in
> data/Keyspace1.  As I wrote this email, the number increased to 1903,
> then to 1938 and 2015, even though the server has no clients.  I used
> jconsole to invoke a few explicit garbage collections and the number
> went down to 811.

Sounds normal.
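(The explicit GC you triggered from jconsole can also be invoked from
code over JMX, which is handy if you want to script it.  A rough
sketch -- the localhost:8080 endpoint is a placeholder, so point it at
whatever JMX port your node actually listens on:)

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;
    import javax.management.JMX;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class ForceGc {
        public static void main(String[] args) throws Exception {
            // Placeholder endpoint: substitute your node's JMX host/port.
            JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
            JMXConnector conn = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbsc = conn.getMBeanServerConnection();
                // java.lang:type=Memory is the same MBean behind
                // jconsole's "Perform GC" button.
                MemoryMXBean memory = JMX.newMXBeanProxy(mbsc,
                        new ObjectName(ManagementFactory.MEMORY_MXBEAN_NAME),
                        MemoryMXBean.class);
                memory.gc();
            } finally {
                conn.close();
            }
        }
    }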
> jconsole reports that the compaction pool has 1670 pending tasks.  As
> I wrote this email, the number gradually increased to 1673.  The
> server has no clients, so this is odd.  The number of completed tasks
> in the compaction pool has consistently been going up while the
> number of pending tasks stays the same.  The number of completed
> tasks increased from 130 to 136.

This is because whenever compaction finishes, it adds another
compaction task to see if the newly compacted table is itself large
enough to compact with others.  In a system where compaction has kept
up with demand, these are quickly cleaned out of the queue, but in
your case they are stuck behind all the compactions that are merging
sstables.

So this is working as designed, but the design is poor because it
causes confusion.  If you can open a ticket for this, that would be
great.

> log.2009-12-02-19: WARN [Timer-0] 2009-12-02 19:55:23,305
> LoadDisseminator.java (line 44) Exception was generated at :
> 12/02/2009 19:55:22 on thread Timer-0

These have been fixed and are unrelated to compaction.

So it sounds like things are working: if you leave it alone for a
while, it will finish compacting everything, the queue of compaction
jobs will clear out, and reads should be fast(er) again.  (The P.S.
below has a quick way to watch the queue drain.)

Like I said originally, increasing the memtable size / object count
will reduce the number of compactions required.  That's about all you
can do in 0.5...

Can you tell if the system is i/o or cpu bound during compaction?

-Jonathan
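P.S. The pending/completed counters you quoted are ordinary JMX
attributes, so you can poll them from a few lines of code instead of
keeping jconsole open.  A rough sketch -- the MBean name below is what
jconsole displays for the compaction pool, but treat it and the
host/port as assumptions to verify against your own node:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class WatchCompaction {
        public static void main(String[] args) throws Exception {
            // Placeholder endpoint: substitute your node's JMX host/port.
            JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
            JMXConnector conn = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbsc = conn.getMBeanServerConnection();
                // Assumed name, as displayed by jconsole for this pool.
                ObjectName pool = new ObjectName(
                    "org.apache.cassandra.concurrent:type=COMPACTION-POOL");
                while (true) {  // Ctrl-C to stop
                    Object pending = mbsc.getAttribute(pool, "PendingTasks");
                    Object completed = mbsc.getAttribute(pool, "CompletedTasks");
                    System.out.println("pending=" + pending
                            + " completed=" + completed);
                    Thread.sleep(10000);  // poll every ten seconds
                }
            } finally {
                conn.close();
            }
        }
    }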