Subject: Re: cassandra nodes with mixed hard disk sizes
From: aaron morton
Date: Wed, 23 Mar 2011 09:05:37 +1300
To: user@cassandra.apache.org
Message-Id: <61D74173-47E6-4A88-A8B8-D335A22D1637@thelastpickle.com>

I probably could have saved myself some time by saying (as Peter and Edward pointed out) "if you use nodes with different capabilities you will need to treat all nodes as having the lowest spec and that could be a waste." :)

Aaron

On 23 Mar 2011, at 07:26, Peter Schuller wrote:

>> Wait! maybe this is a quadruple-whammy since we have to account for
>> the data being replicated to other nodes. At replication factor 3 only
>> 1/3rd of the data on the node actually belongs in that TokenRange, so
>> it is not as simple as having small nodes with smaller ranges, you
>> also have to consider nodes around it and somehow balance them out too.
>> (I am not convinced it can be done)
>
> This is what I was talking about.
>
> However I forgot about the memtable settings etc actually being global
> in that sense as you reiterated (this was presumably what Aaron meant
> from the start - I mis-interpreted). Solving that in a way that
> doesn't make schema management much more complex might be an
> interesting problem. Maybe having a per-node scaling factor for some
> of these things would help.
>
> But that and the RF issue seem like the major concerns.
>
> You mentioned difficulty w.r.t. not only balancing request amount but
> also differing costs per request depending on data sizes - yes, but
> that's just a fundamental problem of balancing systems like this. Just
> because some node is "twice as fast" using some particular metric
> doesn't mean that metric is the only thing of concern for your access
> pattern. I don't think Cassandra exacerbates that particularly.
>
> Virtual nodes or some other method of dispersing data across a ring in
> a more flexible way may mitigate or eliminate the RF induced problem
> (along with having other nice effects).
>
> Anyways, I agree that it is advisable to avoid mixing slim and fat
> nodes in a cluster.
>
> --
> / Peter Schuller
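
A rough sketch of the replication-factor point quoted above, in case the arithmetic helps. It is an illustration only, not Cassandra code: the function name `stored_fraction` and the four-node ring are made up, and it assumes the simple ring placement where each range is also copied to the next RF-1 nodes clockwise (no rack or datacenter awareness).

```python
from fractions import Fraction


def stored_fraction(owned, rf):
    """Return the fraction of the whole dataset each node ends up storing.

    owned: per-node fraction of the ring each node is primary for, in ring order.
    rf:    replication factor.

    Assumption (hypothetical model): every range is replicated to the next
    rf - 1 nodes clockwise, so node i holds its own range plus the ranges
    of the rf - 1 nodes that precede it on the ring.
    """
    n = len(owned)
    if rf > n:
        raise ValueError("replication factor cannot exceed the node count")
    return [sum(owned[(i - k) % n] for k in range(rf)) for i in range(n)]


# Hypothetical ring: node 0 has a small disk, so it is given a range
# one third the size of the others.
ring = [Fraction(1, 10), Fraction(3, 10), Fraction(3, 10), Fraction(3, 10)]
load = stored_fraction(ring, rf=3)

for node, (primary, total) in enumerate(zip(ring, load)):
    print(f"node {node}: primary range {primary}, stored {total}")
```

Under those assumptions the small node is primary for 1/10 of the ring but still stores 7/10 of the dataset, against 9/10 for the most loaded node. Shrinking its range by two thirds cut its storage by only about a fifth, because most of what it holds are replicas of its neighbours' ranges.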