From: Ran Tavory
To: user@cassandra.apache.org
Date: Thu, 6 May 2010 20:56:08 +0300
Subject: Re: performance tuning - where does the slowness come from?
References: <8A606DEA-CB57-4D0B-90C0-FE79B2DE22E9@discovereads.com>

Jonathan, I think it's the case of large values in the columns. The
problematic CF is a key-value store, so it has only one column per row;
however, the value of that column can be large. It's a Java serialized
object (uncompressed), which may be hundreds of bytes or even a few
megabytes. This CF also gets zero cache hits, since every read is for a
unique key.

I ran stress.py and I see much better results (reads are < 1ms), so I
assume my cluster is healthy and I need to fix the app. Would a 1 MB
object explain a 30ms (sometimes even more) read latency? The boxes
aren't fancy; I'm not sure exactly what hardware we have there, but it's
"commodity"...

Thanks!

On Thu, May 6, 2010 at 5:22 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> columns, not CFs.
>
> put another way, how wide are the rows in the slow CF?
>
> On Wed, May 5, 2010 at 11:30 PM, Ran Tavory <rantav@gmail.com> wrote:
> > I have a few CFs but the one I'm seeing slowness in, which is the one
> > with plenty of cache misses, has only one column per key.
> > Latency varies b/w 10ms and 60ms but I'd say average is 30ms.
> >
> > On Thu, May 6, 2010 at 4:25 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
> >>
> >> How many columns are in the rows you are reading from?
> >>
> >> 30ms is quite high, so I suspect you have relatively large rows, in
> >> which case decreasing the column index threshold may help.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
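
As a rough sanity check on the 1 MB question above, the sketch below adds up one disk seek, a sequential read, and a network transfer for a single uncached value. The function name `estimate_read_ms` and all default figures (10 ms seek, 80 MB/s disk throughput, 1 Gbit/s network) are assumptions chosen for illustration, not measurements from the cluster in this thread. It ignores serialization and client-side deserialization costs.

```python
# Back-of-envelope estimate only. Every hardware figure below is an
# assumption, not a measurement from the cluster discussed in this thread.

def estimate_read_ms(value_bytes, seek_ms=10.0, disk_mb_per_s=80.0,
                     net_mbit_per_s=1000.0):
    """Rough cost in milliseconds of one uncached read of a single value."""
    megabytes = value_bytes / 1_000_000
    # One seek plus a sequential read of the whole value from disk.
    disk_ms = seek_ms + megabytes / disk_mb_per_s * 1000.0
    # Time to push the value over the network to the client.
    net_ms = value_bytes * 8 / (net_mbit_per_s * 1_000_000) * 1000.0
    return disk_ms + net_ms


if __name__ == "__main__":
    for size in (500, 100_000, 1_000_000):
        print(f"{size:>9} bytes -> ~{estimate_read_ms(size):.1f} ms")
```

With these assumed figures a 1 MB value comes out at roughly 30 ms, so latency in that range is at least plausible for uncached reads of large values. The real numbers depend on the actual disks and network.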