From: Jordan Pittier - Rezel
Date: Wed, 9 Jun 2010 22:34:13 +0200
Subject: Re: Running a very small cluster
To: user@cassandra.apache.org

Hi,

Regarding point c), you should ask yourself, "what is good performance for
me?". Read performance depends mainly on how fast your hard drives are and
how many rows you can keep in cache. With such a small cluster, if you want
"good" read performance, you had better have fast hard drives and quite a
lot of memory (depending on the size of your data).

Why don't you run the benchmark contrib/stress.py to see what performance
you get?

On Wed, Jun 9, 2010 at 9:59 PM, Per Olesen <pol@trifork.com> wrote:

> Short question: Does Cassandra only *really* shine when running a cluster
> with lots of nodes?
>
> Same question in a lengthy version:
>
> If what I want to obtain from my Cassandra cluster is this:
> a) protection against data loss if a node's disk crashes
> b) good uptime, if servers become unavailable or are taken down for service
> c) good read performance
>
> (notice: I do not need exceptionally good write performance)
>
> So, if I set up a cluster with 3 nodes only, and set ReplicationFactor=3,
> and use QUORUM reads and writes, do I then have a chance of obtaining my
> goal?
>
> Or do I need a Cassandra cluster with "lots" of nodes? I have no number for
> what I mean by "lots", but I regard 2-3 nodes as NOT being a lot :-)
> (I know there are many variables here, but bear with me).
>
> Here are my thoughts:
>
> With respect to (a):
> RF=3 will mean all (3) nodes will have all data on them, which I think of
> as "good enough protection". So, goal reached on that one.
>
> But what about (b):
> With QUORUM reads and writes and RF=3, I can take down one and only one
> node at any time, and still be up and running, right? If correct, I guess
> it is up to me whether that is okay :)
>
> Last, on (c):
> Given that what we are aiming at is good read performance, does it then
> make much sense to run a Cassandra cluster if we only plan on having 3
> nodes? I mean, then there won't be that many nodes to distribute reads to?
>
> /Per
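
A note on point (b) in the quoted message: the "one and only one node" arithmetic can be sketched as below. This is only an illustration, not Cassandra code. The function names are made up for the example, and it assumes the usual definition of QUORUM as a majority of the replicas, i.e. floor(RF / 2) + 1.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM read or write.

    Assumes QUORUM means a majority of replicas: floor(RF / 2) + 1.
    """
    return replication_factor // 2 + 1


def tolerable_replica_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM operations still succeed."""
    return replication_factor - quorum(replication_factor)


# Hypothetical helper for illustration: print the numbers for a few RFs.
for rf in (1, 2, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, "
          f"can lose {tolerable_replica_failures(rf)} replica(s)")
```

With RF=3 the quorum is 2, so one replica can be down, which matches the reasoning in the quoted message. Because a QUORUM read and a QUORUM write together touch 2 + 2 > 3 replicas, any read overlaps the latest successful write on at least one replica.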