Date: Mon, 27 Aug 2012 15:54:07 -0500
Subject: Re: Upgrade path for virtual nodes
From: Eric Evans
To: dev@cassandra.apache.org

On Fri, Aug 24, 2012 at 3:39 PM, Eric Evans wrote:
> On Fri, Aug 24, 2012 at 11:27 AM, Jonathan Ellis wrote:
>> On Fri, Aug 24, 2012 at 11:23 AM, Eric Evans wrote:
>>> Actually, now that I think about it, I'd probably drop the entire
>>> notion of a "coordinator", and write the respective entries into a
>>> column family in the system keyspaces. Each system could then work
>>> through their respective queue of relocations at their own pace.
>>
>> Sounds reasonable.
>
> OK, then unless someone steps forward with a better idea, I'll proceed
> with this approach.

I've updated the ticket[1] with a link to a patch[2] that implements
what I was thinking here.

Each node implements a scheduler that periodically looks in a system
table for token ranges that should be relocated to it. As a safety
measure, it will skip new transfers if the actual number of tokens
exceeds num_tokens by 10% or more (giving slower nodes a chance to
catch up, if needed). The periodic scheduler can be enabled and
disabled using JMX.
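
To make the safety check concrete, here is a rough sketch of one
scheduler pass. It is only an illustration in Python, not code from the
patch (which is Java); the class, field, and method names are all
hypothetical, an in-memory list stands in for the system table, and a
boolean stands in for the JMX switch.

```python
from dataclasses import dataclass, field


@dataclass
class RelocationScheduler:
    """Hypothetical per-node scheduler; every name here is illustrative."""

    num_tokens: int                                   # configured target token count
    owned_tokens: set = field(default_factory=set)    # tokens the node holds now
    queue: list = field(default_factory=list)         # stand-in for the system table
    enabled: bool = True                              # stand-in for the JMX on/off switch

    def over_limit(self) -> bool:
        # True when the node holds at least 10% more tokens than num_tokens.
        # Integer arithmetic avoids floating-point rounding at the boundary.
        excess = len(self.owned_tokens) - self.num_tokens
        return excess * 10 >= self.num_tokens

    def run_once(self) -> list:
        """One periodic pass: start queued relocations until the guard trips."""
        started = []
        if not self.enabled:
            return started
        while self.queue and not self.over_limit():
            token = self.queue.pop(0)
            # A real node would stream the range before taking ownership.
            self.owned_tokens.add(token)
            started.append(token)
        return started
```

With num_tokens set to 10 and 8 tokens already owned, a pass over five
queued entries starts three transfers and leaves two queued, since the
node then holds 11 tokens (10% over). A real node would also be giving
up tokens to other nodes over time, which this sketch does not model.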

What remains is to create the administrative tool, something to
calculate the token moves and populate the tables with the respective
entries. Any thoughts on this? Should this be something baked into
nodetool, or a separate utility? Can we add the entries directly, or
should this be done via JMX?

Also, do we have an exact date for the freeze yet? I assume I have at
least until Sylvain returns from holiday. :)

Thoughts, comments, ideas?

[1]: https://issues.apache.org/jira/browse/CASSANDRA-4443
[2]: https://github.com/acunu/cassandra/compare/top-bases/p/4443/050_process_queued_xfers...p/4443/050_process_queued_xfers.diff

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu