From: Janne Jalkanen
To: user@cassandra.apache.org
Subject: Re: Best strategy for adding new nodes to the cluster
Date: Tue, 28 Sep 2010 12:19:16 +0300
Message-Id: <39CC8664-829B-4DC0-A677-03722F813372@ecyrd.com>
In-Reply-To: <9E697292-473F-46F9-A41F-463D4C18EF94@duergner.de>

On 28 Sep 2010, at 08:37, Michael Dürgner wrote:

>> What do you mean by "running live"? I am also planning to use cassandra on EC2 using small nodes. Small nodes have 1/4 cpu of the large ones, 1/4 cost, but I/O is more than 1/4 (amazon does not give explicit I/O numbers...), so I think 4 small instances should perform better than 1 large one (and the cost is the same), am I wrong?
>
> Based on results we saw and what you also find in different sources around the web, EC2 small instances perform worse than 1/4 regarding IO performance.

Ditto. My tests indicate that while the peak IO performance of small nodes can be ok (up to 1/2 of large), it degrades over time down to 1/6 or even less. It seems that Amazon dedicates sufficient bandwidth to small nodes in the beginning to ensure a smooth and quick boot, but then throttles down fairly aggressively within a few minutes. This seems to affect reads more than writes, though.

Note also that large instances have over 4x the memory (1.7 GB => 7.5 GB), and that makes a world of difference (you can have larger caches, for example). You don't really want to start swapping on the small instances.

(However, small instances are awesome for doing testing and learning how to manage a cluster.)

/Janne
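
For anyone who wants to plug in their own measurements, here is a rough back-of-the-envelope sketch of the comparison above. It uses only the ratios quoted in this thread (peak 1/2, sustained 1/6, 1.7 GB vs 7.5 GB, four small instances for the price of one large). The names are made up for illustration, and the assumption that I/O adds up linearly across nodes is mine and probably optimistic.

```python
# Back-of-the-envelope comparison using only the ratios quoted in this thread.
# All I/O figures are relative to one large instance (= 1); they are rough
# observations from the message, not numbers published by Amazon.
from fractions import Fraction

SMALL_PER_LARGE_COST = 4             # four small instances cost about one large
SMALL_IO_PEAK = Fraction(1, 2)       # "up to 1/2 of large" shortly after boot
SMALL_IO_SUSTAINED = Fraction(1, 6)  # "down to 1/6 or even less" once throttled
SMALL_MEM_GB = 1.7
LARGE_MEM_GB = 7.5


def aggregate_io(per_node_ratio, nodes=SMALL_PER_LARGE_COST):
    """Combined I/O of `nodes` small instances, relative to one large instance.

    Assumes I/O scales linearly with node count (an illustrative assumption).
    """
    return per_node_ratio * nodes


peak = aggregate_io(SMALL_IO_PEAK)
sustained = aggregate_io(SMALL_IO_SUSTAINED)
mem_ratio = LARGE_MEM_GB / SMALL_MEM_GB

print(f"4 small, peak I/O:      {float(peak):.2f}x one large")
print(f"4 small, sustained I/O: {float(sustained):.2f}x one large")
print(f"large vs small memory per node: {mem_ratio:.1f}x")
```

With these inputs, four small instances look better than one large only at peak; once the throttling kicks in, the same four nodes come out at roughly two thirds of a single large instance for the same cost, and each node still has less than a quarter of the memory for caches.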