From: "Freeman, Tim" <tim.freeman@hp.com>
To: cassandra-user@incubator.apache.org
Date: Thu, 3 Dec 2009 22:34:17 +0000
Subject: RE: Persistently increasing read latency

>Can you tell if the system is i/o or cpu bound during compaction?

It's I/O bound.  It's using ~9% of 1 of 4 cores as I watch it, and all
it's doing right now is compactions.

Tim Freeman
Email: tim.freeman@hp.com
Desk in Palo Alto: (650) 857-2581
Home: (408) 774-1298
Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and
Thursday; call my desk instead.)

-----Original Message-----
From: Jonathan Ellis [mailto:jbellis@gmail.com]
Sent: Thursday, December 03, 2009 2:19 PM
To: cassandra-user@incubator.apache.org
Subject: Re: Persistently increasing read latency

On Thu, Dec 3, 2009 at 3:59 PM, Freeman, Tim <tim.freeman@hp.com> wrote:
> I stopped the client at 11:28.  There were 2306 files in
> data/Keyspace1.  It's now 12:44, and there are 1826 files in
> data/Keyspace1.  As I wrote this email, the number increased to 1903,
> then to 1938 and 2015, even though the server has no clients.  I used
> jconsole to invoke a few explicit garbage collections and the number
> went down to 811.

Sounds normal.
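(The explicit GC you triggered from jconsole can also be invoked from
code over JMX, which is handy if you want to script it.  A rough
sketch -- the localhost:8080 endpoint is a placeholder, so point it at
whatever JMX port your node actually listens on:)

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;
    import javax.management.JMX;
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class ForceGc {
        public static void main(String[] args) throws Exception {
            // Placeholder endpoint: substitute your node's JMX host/port.
            JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
            JMXConnector conn = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbsc = conn.getMBeanServerConnection();
                // java.lang:type=Memory is the same MBean behind
                // jconsole's "Perform GC" button.
                MemoryMXBean memory = JMX.newMXBeanProxy(mbsc,
                        new ObjectName(ManagementFactory.MEMORY_MXBEAN_NAME),
                        MemoryMXBean.class);
                memory.gc();
            } finally {
                conn.close();
            }
        }
    }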
> jconsole reports that the compaction pool has 1670 pending tasks.  As
> I wrote this email, the number gradually increased to 1673.  The
> server has no clients, so this is odd.  The number of completed tasks
> in the compaction pool has consistently been going up while the
> number of pending tasks stays the same.  The number of completed
> tasks increased from 130 to 136.

This is because whenever compaction finishes, it adds another
compaction task to see if the newly compacted table is itself large
enough to compact with others.  In a system where compaction has kept
up with demand, these are quickly cleaned out of the queue, but in
your case they are stuck behind all the compactions that are merging
sstables.

So this is working as designed, but the design is poor because it
causes confusion.  If you can open a ticket for this, that would be
great.

> log.2009-12-02-19: WARN [Timer-0] 2009-12-02 19:55:23,305
> LoadDisseminator.java (line 44) Exception was generated at :
> 12/02/2009 19:55:22 on thread Timer-0

These have been fixed and are unrelated to compaction.

So it sounds like things are working: if you leave it alone for a
while, it will finish compacting everything, the queue of compaction
jobs will clear out, and reads should be fast(er) again.  (The P.S.
below has a quick way to watch the queue drain.)

Like I said originally, increasing the memtable size / object count
will reduce the number of compactions required.  That's about all you
can do in 0.5...

Can you tell if the system is i/o or cpu bound during compaction?

-Jonathan
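P.S. The pending/completed counters you quoted are ordinary JMX
attributes, so you can poll them from a few lines of code instead of
keeping jconsole open.  A rough sketch -- the MBean name below is what
jconsole displays for the compaction pool, but treat it and the
host/port as assumptions to verify against your own node:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class WatchCompaction {
        public static void main(String[] args) throws Exception {
            // Placeholder endpoint: substitute your node's JMX host/port.
            JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
            JMXConnector conn = JMXConnectorFactory.connect(url);
            try {
                MBeanServerConnection mbsc = conn.getMBeanServerConnection();
                // Assumed name, as displayed by jconsole for this pool.
                ObjectName pool = new ObjectName(
                    "org.apache.cassandra.concurrent:type=COMPACTION-POOL");
                while (true) {  // Ctrl-C to stop
                    Object pending = mbsc.getAttribute(pool, "PendingTasks");
                    Object completed = mbsc.getAttribute(pool, "CompletedTasks");
                    System.out.println("pending=" + pending
                            + " completed=" + completed);
                    Thread.sleep(10000);  // poll every ten seconds
                }
            } finally {
                conn.close();
            }
        }
    }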