From: "Dan Hendry"
Reply-To: user@cassandra.apache.org
Subject: RE: Read perf investigation
Date: Thu, 3 Nov 2011 19:01:02 -0400
Message-ID: <4eb31d42.a5b4ec0a.6ad6.7e05@mx.google.com>

Uh, so look at your await time of *107.73*. From the iostat man page: "await: The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them."

If the key you are reading is not in Cassandra's key cache or row cache, Cassandra needs to do two disk seeks (http://www.datastax.com/dev/blog/maximizing-cache-benefit-with-cassandra). This means that some of your reads *must* take on average 215 ms (2 x 107.73 ms), not even including network latency. Looks like EBS, or more generally disk saturation, is your problem. Perhaps consider RAID0 with ephemeral drives.

Dan

From: Ian Danforth [mailto:idanforth@numenta.com]
Sent: November-03-11 18:34
To: user@cassandra.apache.org
Subject: Read perf investigation

All,

I've done a bit more homework, and I continue to see long 200ms to 300ms read times for some keys.

Test Setup

EC2 M1Large sending requests to a 5 node C* cluster also in EC2, also all M1Large. RF=3. ReadConsistency = ONE. I'm using pycassa from python for all communication.

Data Model

One column family with tens of millions of rows. The number of columns per row varies between 0 and 1440 (per minute records). The values are all ints. All data stored on EBS volumes. Total load per node is ~110GB.

According to VMstat I'm not swapping at all.

Highest %Util I see

Device:   rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
xvdf        0.00  2788.00    17.00   267.50  1168.00 23020.00    85.02    32.37   107.73     1.22    34.60

A more average profile I see is:

Device:   rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await    svctm    %util
xvdf        0.00     0.00    21.00     0.00  1288.00     0.00    61.33     0.37    18.38     9.43    19.80

QUESTION

Where should I look next? I'd love to get a profile of exactly where cassandra is spending its time on a per call basis.

Thanks in advance,

Ian
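
For anyone who wants to redo the arithmetic behind the 215 ms figure in the reply, here is a minimal sketch in Python. It only uses the await and svctm values from the two iostat samples in the original message. The two-seeks-per-uncached-read figure is the assumption stated in the reply, and treating each seek as costing one full await is a simplification, not a measurement. The function and variable names are made up for illustration.

```python
# iostat figures copied from the two samples in the original message.
SAMPLES = {
    "highest %util": {"await_ms": 107.73, "svctm_ms": 1.22},
    "more average": {"await_ms": 18.38, "svctm_ms": 9.43},
}

# Assumption taken from the reply: a read that misses both the key cache
# and the row cache costs two disk seeks.
SEEKS_PER_UNCACHED_READ = 2


def uncached_read_ms(await_ms, seeks=SEEKS_PER_UNCACHED_READ):
    """Rough estimate of read latency, charging one full await per seek."""
    return seeks * await_ms


def queue_wait_ms(await_ms, svctm_ms):
    """Portion of await spent waiting in the queue rather than in service."""
    return await_ms - svctm_ms


if __name__ == "__main__":
    for name, sample in SAMPLES.items():
        estimate = uncached_read_ms(sample["await_ms"])
        queued = queue_wait_ms(sample["await_ms"], sample["svctm_ms"])
        print(f"{name}: ~{estimate:.0f} ms per uncached read, "
              f"{queued:.2f} ms of each await spent queued")
```

In the busy sample almost all of the await is queue time rather than service time, which is consistent with the disk saturation diagnosis in the reply.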
