From: Sam Overton
Date: Mon, 29 Apr 2013 12:08:57 +0100
Subject: Re: cassandra-shuffle time to completion and required disk space
To: user@cassandra.apache.org

An alternative to running shuffle is to do a rolling bootstrap/decommission. You would set num_tokens on the existing hosts (and restart them) so that they split their ranges, then bootstrap in N new hosts, then decommission the old ones.

On 28 April 2013 22:21, John Watson <john@disqus.com> wrote:

> The amount of time/space cassandra-shuffle requires when upgrading to
> using vnodes should really be apparent in documentation (when some is made).
>
> Only semi-noticeable remark about the exorbitant amount of time is a
> bullet point in: http://wiki.apache.org/cassandra/VirtualNodes/Balance
>
> "Shuffling will entail moving a lot of data around the cluster and so has
> the potential to consume a lot of disk and network I/O, and to take a
> considerable amount of time. For this to be an online operation, the
> shuffle will need to operate on a lower priority basis to other streaming
> operations, and should be expected to take days or weeks to complete."
>
> We tried running shuffle on a QA version of our cluster and 2 things were
> brought to light:
> - Even with no reads/writes it was going to take 20 days
> - Each machine needed enough free diskspace to potentially hold the
>   entire cluster's sstables on disk
>
> Regards,
>
> John

--
Sam Overton
Acunu | http://www.acunu.com | @acunu
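For what it's worth, the ordering of the rolling bootstrap/decommission described at the top of this message can be sketched as a small Python helper that only builds a checklist and runs nothing. The host names, the `Step` type, the `rolling_vnode_plan` function, and the `num_tokens` value of 256 are illustrative assumptions, not part of the original message; the restart step is left as plain text because the actual command depends on how Cassandra was installed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    host: str    # hypothetical host name
    action: str  # human-readable instruction, not executed


def rolling_vnode_plan(old_hosts: List[str],
                       new_hosts: List[str],
                       num_tokens: int = 256) -> List[Step]:
    """Build an ordered checklist for the rolling bootstrap/decommission.

    num_tokens=256 is an assumed value; pick whatever suits the cluster.
    """
    plan: List[Step] = []

    # 1. Existing hosts: set num_tokens and restart so they split their ranges.
    for host in old_hosts:
        plan.append(Step(host, f"set num_tokens: {num_tokens} in cassandra.yaml"))
        plan.append(Step(host, "restart cassandra"))

    # 2. New hosts: configure vnodes, then let each one bootstrap into the ring.
    for host in new_hosts:
        plan.append(Step(host, f"set num_tokens: {num_tokens} in cassandra.yaml"))
        plan.append(Step(host, "start cassandra and wait for bootstrap to finish"))

    # 3. Old hosts: decommission only after the new hosts have joined.
    for host in old_hosts:
        plan.append(Step(host, "nodetool decommission"))

    return plan


if __name__ == "__main__":
    steps = rolling_vnode_plan(["old1", "old2"], ["new1", "new2"])
    for i, step in enumerate(steps, 1):
        print(f"{i}. [{step.host}] {step.action}")
```

Keeping the plan as data rather than running commands directly makes it easy to review the order (split ranges first, bootstrap second, decommission last) before touching a live cluster.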