From: Chris Were
Reply-To: chris@chriswere.com
Date: Thu, 29 Oct 2009 20:50:10 +1030
Subject: Re: backing up data from cassandra
To: cassandra-user@incubator.apache.org

Is it possible to only back up selected column families?

On Wed, Oct 7, 2009 at 8:15 AM, Jonathan Ellis wrote:

> I don't really see "nodeprobe snapshot" and "mv snapshotdir/* livedir"
> as all that much harder, but maybe that's just me.
>
> For a cluster, just add dsh.
>
> -Jonathan
>
> On Tue, Oct 6, 2009 at 3:42 PM, Joe Van Dyk wrote:
> > Sure, not as easy as a "pg_dump db > dump.sql" and "psql db < dump.sql",
> > though. Oh well.
> >
> > On Tue, Oct 6, 2009 at 11:28 AM, Edmond Lau wrote:
> >> Thanks for the replies, guys. It sounds like restoration via snapshots
> >> + some application-side logic to sanity-check/repair any data around
> >> the snapshot time is the way to go.
> >>
> >> Edmond
> >>
> >> On Mon, Oct 5, 2009 at 10:15 AM, Jonathan Ellis wrote:
> >>> On Mon, Oct 5, 2009 at 11:23 AM, Thorsten von Eicken <tve@rightscale.com> wrote:
> >>>> Isn't the question about how you back up a Cassandra cluster, not a
> >>>> single node?
> >>>
> >>> Sure, but the generalization is straightforward. :)
> >>>
> >>>> Can you snapshot the various nodes at different times, or do they
> >>>> need to be synchronized?
> >>>
> >>> The closer the synchronization, the more consistent they will be.
> >>> (Since Cassandra is designed around eventual consistency, there's
> >>> some flexibility here. Conversely, there's no way to tell the system
> >>> "don't accept any more writes until the snapshot is done.")
> >>>
> >>>> Is there a minimal set of nodes that is sufficient to back up?
> >>>
> >>> Assuming your replication is 100% up to date, backing up every Nth
> >>> node, where N is the replication factor, could be adequate in theory,
> >>> but I wouldn't recommend trying to be clever like that: if you
> >>> "restored" from such a backup, your system would be in a degraded
> >>> state and vulnerable to any of the restored nodes failing.
> >>>
> >>> -Jonathan
> >>
> >
> >
> > --
> > Joe Van Dyk
> > http://fixieconsulting.com
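
(For anyone reading this in the archives: spelled out as commands, the
snapshot workflow Jonathan describes looks roughly like the sketch below.
Treat it as a sketch under assumptions, not a recipe -- the data-directory
path, the snapshot layout, the dsh group name, and the exact nodeprobe
flags depend on your release and on storage-conf.xml, so verify them on
your own nodes.)

    # Take a snapshot on one node. Snapshots hard-link the current
    # SSTables, so this is fast and cheap on disk.
    nodeprobe -host localhost snapshot

    # Copy the snapshot off the box. The data directory and snapshots/
    # layout here are assumptions; check DataFileDirectory in
    # storage-conf.xml for the real location.
    rsync -a /var/lib/cassandra/data/snapshots/ backuphost:/backups/$(hostname)/

    # To restore: stop Cassandra on the node, move the snapshotted
    # SSTables back into the live data directory ("mv snapshotdir/*
    # livedir"), and restart. $SNAPSHOT is the name of the snapshot
    # directory you want to restore from.
    mv /backups/$(hostname)/"$SNAPSHOT"/* /var/lib/cassandra/data/

    # For a cluster, trigger the snapshot on every node at roughly the
    # same time. "cassandra" is a hypothetical dsh group listing your
    # nodes; -c runs the command on all of them concurrently.
    dsh -g cassandra -c -- 'nodeprobe -host localhost snapshot'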