From: Scott White <scottblanc@gmail.com>
To: user@cassandra.apache.org
Subject: Re: Worst case #iops to read a row
Date: Tue, 13 Apr 2010 11:52:55 -0700 (PDT)

> Do you understand you are assuming there have been no compactions,
> which would be extremely bad practice given this number of SSTables?
> A major compaction, as would be best practice given this volume, would
> result in 1 SSTable per CF per node. One. Similarly, you are
> assuming the update is only on the last replica checked, but the
> system is going to read and write the first replica (the node that
> actually has that range based on its token) first in almost all
> situations.
>
> Not worst case? If 'we' are coming up with arbitrarily bad
> situations, why not assume 1 row per SSTable, lots of tombstones, in
> addition to no compactions? Why not assume RF=100? Why not assume
> node failures right in the middle of your query?
> The interesting question is not 'how bad can this get if you
> configure and operate things really badly?', but 'how bad can this
> get if you configure and operate things according to best practices?'.

Agreed. Doing a worst-case complexity analysis is tricky because (a) you need to know what the best practices are, and (b) you need to know, when following those best practices, what the worst-case "snapshot" of a healthy cluster looks like. For example, major compactions only run periodically, so read iops should steadily degrade until the next major compaction kicks in. It's a very interesting question, though, and I would love to see it pursued further.

Scott
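
For concreteness, here is a minimal sketch of the seek arithmetic being debated above. It assumes a simplified read path in which every SSTable consulted costs roughly one index seek plus one data seek, and it ignores bloom filters and the key/row caches, so it is pessimistic; the SSTable counts and the 2-seeks-per-SSTable figure are illustrative assumptions, not measurements from a real cluster.

# Rough estimate of disk seeks needed to read one row on a single node.
# Assumption: each SSTable consulted costs ~1 index seek + 1 data seek;
# bloom filters and caches are ignored, so these numbers are pessimistic.
def seeks_per_read(sstables_per_cf, seeks_per_sstable=2):
    return sstables_per_cf * seeks_per_sstable

# Best practice: a recent major compaction leaves 1 SSTable per CF per node.
print("after major compaction:", seeks_per_read(1))        # ~2 seeks

# Between major compactions, memtable flushes add SSTables and reads degrade.
print("10 uncompacted SSTables:", seeks_per_read(10))       # ~20 seeks

# The arbitrarily bad case from the thread: no compaction at all.
print("1000 uncompacted SSTables:", seeks_per_read(1000))   # ~2000 seeks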