Date: Tue, 20 Mar 2012 15:37:13 +0000
Subject: Re: RFC: Cassandra Virtual Nodes
From: Sam Overton
Reply-To: sam@acunu.com
To: Peter Schuller
Cc: dev@cassandra.apache.org

On 19 March 2012 23:41, Peter Schuller wrote:
>>> Using this ring bucket in the CRUSH topology, (with the hash function
>>> being the identity function) would give the exact same distribution
>>> properties as the virtual node strategy that I suggested previously,
>>> but of course with much better topology awareness.
>>
>> I will have to re-read your original post. I seem to have missed something :)
>
> I did, and I may or may not understand what you mean.
>
> Are you comparing vnodes + hashing, with CRUSH + pre-partitioning by
> hash + identity hash as you traverse down the topology tree?

Yes. I was just trying to illustrate that it's not necessary to have
CRUSH doing the partitioning and placement of primary replicas.
The same functionality can be achieved by having logically separate
placement (a ring with virtual nodes) and a replication strategy which
implements the CRUSH algorithm for replica placement. I think you agreed
with this further down your previous reply anyway, perhaps I was just
being too verbose :)

The reason I'm trying to make that distinction is because it will be
less work than wholesale replacing the entire distribution logic in
Cassandra with CRUSH. I'm not sure if that's exactly what your design is
suggesting?

-- 
Sam Overton
Acunu | http://www.acunu.com | @acunu
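
To make the separation described above concrete, here is a minimal Python sketch. It is illustrative only: it is not Cassandra code, and the replica rule is a simplified rack-aware stand-in rather than a real CRUSH implementation. The names `VnodeRing`, `rack_aware_replicas`, `rack_of`, the host names, and the use of MD5 as the partitioner are all assumptions made for the example. The point is only that the ring decides which virtual node owns a key, while a separate, pluggable strategy decides where the other replicas go.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Hash a key onto the ring (MD5 used here as a stand-in partitioner)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class VnodeRing:
    """Placement: each host owns several tokens (virtual nodes) on one ring."""

    def __init__(self, hosts, vnodes_per_host=8):
        pairs = sorted(
            (token(f"{host}#{i}"), host)
            for host in hosts
            for i in range(vnodes_per_host)
        )
        self.tokens = [t for t, _ in pairs]
        self.owners = [h for _, h in pairs]

    def walk(self, key):
        """Yield hosts clockwise, starting at the vnode that owns `key`."""
        start = bisect.bisect_left(self.tokens, token(key)) % len(self.tokens)
        for i in range(len(self.owners)):
            yield self.owners[(start + i) % len(self.owners)]


def rack_aware_replicas(ring, key, rack_of, rf):
    """Replication strategy, kept separate from the ring.

    The primary replica is whatever the ring says; further replicas are the
    next hosts clockwise that sit in a rack not used yet. This is a
    simplified topology-aware rule standing in for CRUSH, not CRUSH itself.
    """
    replicas, used_racks = [], set()
    for host in ring.walk(key):
        if rack_of[host] not in used_racks:
            replicas.append(host)
            used_racks.add(rack_of[host])
        if len(replicas) == rf:
            break
    return replicas


# Hypothetical six-host, three-rack cluster.
rack_of = {
    "a1": "rackA", "a2": "rackA",
    "b1": "rackB", "b2": "rackB",
    "c1": "rackC", "c2": "rackC",
}
ring = VnodeRing(rack_of)
replicas = rack_aware_replicas(ring, "some-key", rack_of, rf=3)
print(replicas)
```

Swapping in a different replica rule only means replacing `rack_aware_replicas`; the ring and its virtual nodes are untouched, which is the "less work" argument made above. Note that this sketch returns fewer than `rf` replicas when there are fewer racks than `rf`.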