From: Scott White <scottblanc@gmail.com>
To: user@cassandra.apache.org
Subject: Re: Worst case #iops to read a row
Date: Tue, 13 Apr 2010 11:52:55 -0700 (PDT)

> Do you understand you are assuming there have been no compactions,
> which would be extremely bad practice given this number of SSTables?
> A major compaction, as would be best practice given this volume, would
> result in 1 SSTable per CF per node. One. Similarly, you are
> assuming the update is only on the last replica checked, but the
> system is going to read and write the first replica (the node that
> actually has that range based on its token) first in almost all
> situations.
>
> Not worst case? If 'we' are coming up with arbitrarily bad
> situations, why not assume 1 row per SSTable, lots of tombstones, in
> addition to no compactions? Why not assume RF=100? Why not assume
> node failures right in the middle of your query?
> The interesting question is not 'how bad can this get if you
> configure and operate things really badly?', but 'how bad can this
> get if you configure and operate things according to best practices?'.

Agreed. Doing a worst-case complexity analysis is tricky because (a) you need to know what the best practices are, and (b) you need to know, when following those best practices, what the worst-case "snapshot" of a healthy cluster looks like. For example, major compactions only run periodically, so read iops should steadily degrade until the next major compaction kicks in. It's a very interesting question, though, and I would love to see it pursued further.

Scott
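
For concreteness, here is a minimal sketch of the seek arithmetic being debated above. It assumes a simplified read path in which every SSTable consulted costs roughly one index seek plus one data seek, and it ignores bloom filters and the key/row caches, so it is pessimistic; the SSTable counts and the 2-seeks-per-SSTable figure are illustrative assumptions, not measurements from a real cluster.

# Rough estimate of disk seeks needed to read one row on a single node.
# Assumption: each SSTable consulted costs ~1 index seek + 1 data seek;
# bloom filters and caches are ignored, so these numbers are pessimistic.
def seeks_per_read(sstables_per_cf, seeks_per_sstable=2):
    return sstables_per_cf * seeks_per_sstable

# Best practice: a recent major compaction leaves 1 SSTable per CF per node.
print("after major compaction:", seeks_per_read(1))        # ~2 seeks

# Between major compactions, memtable flushes add SSTables and reads degrade.
print("10 uncompacted SSTables:", seeks_per_read(10))       # ~20 seeks

# The arbitrarily bad case from the thread: no compaction at all.
print("1000 uncompacted SSTables:", seeks_per_read(1000))   # ~2000 seeks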