From: Brandon Williams
Date: Thu, 3 Feb 2011 19:22:51 -0600
Subject: Re: Problems with Python Stress Test
To: user@cassandra.apache.org

On Thu, Feb 3, 2011 at 7:02 PM, Sameer Farooqui wrote:

> Hi guys,
>
> I was playing around with the stress.py test this week and noticed a few
> things.
>
> 1) Progress-interval does not always work correctly. I set it to 5 in the
> example below, but am instead getting varying intervals:

In my experience, that generally indicates that the client machine is
overloaded.

> 2) The key_rate and op_rate doesn't seem to be calculated correctly. Also,
> what is the difference between the interval_key_rate and the
> interval_op_rate? For example in the example above, the first row shows 6662
> keys inserted in 5 seconds and 6662 / 5 = 1332, which matches the
> interval_op_rate.

There should be no difference unless you're doing range slices, but IPC
timing makes them vary somewhat.

> 3) If I write x KB to Cassandra with py_stress, the used disk space doesn't
> grow by x after the test. In the example below I tried to write 500,000 keys
> * 32 bytes * 5 columns = 78,125 kilobytes of data to the database. When I
> checked the amount of disk space used after the test it actually grew by
> 2,684,920 - 2,515,864 = 169,056 kilobytes. Is this because perhaps the
> commit log got duplicate copies of the data as the SSTables?

Commitlogs could be part of it. You're also not factoring in the column
names, and then there's index and bloom filter overhead.

Use contrib/stress on 0.7 instead.

-Brandon
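
The arithmetic in the third question can be checked with a short Python sketch. It uses only the figures quoted in the thread. The helper payload_kb and its name_bytes and per_column_overhead parameters are hypothetical and exist only so that other sizes can be tried; they are not measured values and not part of stress.py.

```python
# Figures quoted in the thread.
KEYS = 500_000
COLUMNS_PER_KEY = 5
VALUE_BYTES = 32
DISK_BEFORE_KB = 2_515_864
DISK_AFTER_KB = 2_684_920


def payload_kb(keys, columns, value_bytes, name_bytes=0, per_column_overhead=0):
    """Estimated size in KB (1 KB = 1024 bytes) of the column data alone.

    name_bytes and per_column_overhead are illustrative knobs (assumptions),
    defaulting to 0 so the result matches the values-only estimate in the thread.
    """
    per_column = value_bytes + name_bytes + per_column_overhead
    return keys * columns * per_column / 1024


values_only_kb = payload_kb(KEYS, COLUMNS_PER_KEY, VALUE_BYTES)
observed_growth_kb = DISK_AFTER_KB - DISK_BEFORE_KB
unexplained_kb = observed_growth_kb - values_only_kb

print(f"values only:     {values_only_kb:,.0f} KB")
print(f"observed growth: {observed_growth_kb:,} KB")
print(f"not explained by values: {unexplained_kb:,.0f} KB "
      f"({observed_growth_kb / values_only_kb:.2f}x the value payload)")

# Interval rate from the second question: 6662 keys in a 5-second interval.
interval_key_rate = 6662 / 5
print(f"interval key rate: {interval_key_rate:.1f} keys/s")
```

The values alone come to 78,125 KB against an observed growth of 169,056 KB, which leaves 90,931 KB to be covered by the factors named in the reply: commit logs, column names, and index and bloom filter overhead. Passing a non-zero name_bytes shows how quickly column names add up at this column count, but any specific value is a guess unless it is taken from the actual test run.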