Date: Tue, 6 Oct 2009 16:45:56 -0500
Subject: Re: backing up data from cassandra
From: Jonathan Ellis
To: cassandra-user@incubator.apache.org

I don't really see "nodeprobe snapshot" and "mv snapshotdir/* livedir"
as all that much harder, but maybe that's just me.  for a cluster,
just add dsh.

-Jonathan

On Tue, Oct 6, 2009 at 3:42 PM, Joe Van Dyk wrote:
> Sure not as easy as a "pg_dump db > dump.sql" and "psql db < dump.sql"
> though.  Oh well.
>
>
>
> On Tue, Oct 6, 2009 at 11:28 AM, Edmond Lau wrote:
>> Thanks for the replies guys.  It sounds like restoration via snapshots
>> + some application-side logic to sanity check/repair any data around
>> the snapshot time is the way to go.
>>
>> Edmond
>>
>> On Mon, Oct 5, 2009 at 10:15 AM, Jonathan Ellis wrote:
>>> On Mon, Oct 5, 2009 at 11:23 AM, Thorsten von Eicken wrote:
>>>> Isn't the question about how you back up a cassandra cluster, not a
>>>> single node?
>>>
>>> Sure, but the generalization is straightforward. :)
>>>
>>>> Can you snapshot the various nodes at different times or do
>>>> they need to be synchronized?
>>>
>>> The closer the synchronization, the more consistent they will be.
>>> (Since Cassandra is designed around eventual consistency, there's some
>>> flexibility here.
>>> Conversely, there's no way to tell the system
>>> "don't accept any more writes until the snapshot is done.")
>>>
>>>> Is there a minimal set of nodes that are
>>>> sufficient to back up?
>>>
>>> Assuming your replication is 100% up to date, backing up every N nodes
>>> where N is the replication factor could be adequate in theory, but I
>>> wouldn't recommend trying to be clever like that, since if you
>>> "restored" from backup like that your system would be in a degraded
>>> state and vulnerable to any of the restored nodes failing.
>>>
>>> -Jonathan
>>>
>>
>
>
>
> --
> Joe Van Dyk
> http://fixieconsulting.com
>
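
The point in the thread about backing up only every Nth node (N being the
replication factor) can be illustrated with a small sketch. This is not
Cassandra code. It assumes a simplified ring in which each range is stored on
its owning node plus the next N-1 nodes, and a node count that is a multiple
of N. The function names are hypothetical and exist only for this example.

```python
from typing import Dict, List, Set


def replicas_for_range(owner: int, num_nodes: int, rf: int) -> List[int]:
    """Nodes assumed to hold the range owned by `owner`:
    the owner itself plus the next rf-1 nodes around the ring."""
    return [(owner + i) % num_nodes for i in range(rf)]


def surviving_copies(num_nodes: int, rf: int, backed_up: Set[int]) -> Dict[int, int]:
    """For each range (keyed by its owning node), count how many of its
    replicas sit on a node that was included in the backup."""
    return {
        owner: sum(1 for n in replicas_for_range(owner, num_nodes, rf) if n in backed_up)
        for owner in range(num_nodes)
    }


if __name__ == "__main__":
    num_nodes, rf = 6, 3  # illustrative values, not taken from the thread

    # Back up every rf-th node only.
    sparse_backup = set(range(0, num_nodes, rf))
    print("backed-up nodes:", sorted(sparse_backup))
    print("copies per range after restore:", surviving_copies(num_nodes, rf, sparse_backup))

    # Lose one of the restored nodes and recount.
    after_failure = sparse_backup - {0}
    print("copies per range after node 0 fails:", surviving_copies(num_nodes, rf, after_failure))
```

Under these assumptions every range is recoverable from the sparse backup, but
only from a single copy, so losing any one restored node leaves some ranges
with no copy at all. That is the "degraded state" described above, and the
reason backing up every node is the safer choice.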