incubator-cassandra-user mailing list archives

From Anthony Molinaro <antho...@alumni.caltech.edu>
Subject Re: backing up data from cassandra
Date Mon, 05 Oct 2009 16:10:48 GMT
I assume the server also needs to be stopped while you are swapping
files, but what if you have a cluster of several servers and
need to restore?  Is the process to shut down all the servers, move
the files, and restart?  Or can you do it one at a time?  (I assume
one at a time might mean a lot of read-repair work happening, so not
a good idea.)
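
To make the question concrete, here is roughly the per-node file swap I
have in mind, as a Python sketch only.  The function name and both
directory arguments are made up for illustration, it copies rather than
moves so the snapshot stays intact, it does not remove anything already
in the live directory, and it assumes the node has already been stopped:

```python
import shutil
from pathlib import Path


def restore_snapshot(snapshot_dir, live_dir):
    """Copy every regular file from snapshot_dir into live_dir.

    Hypothetical helper: both paths are supplied by the caller, and the
    node is assumed to be stopped before this runs. Returns the names of
    the files that were copied, in sorted order.
    """
    snapshot = Path(snapshot_dir)
    live = Path(live_dir)
    if not snapshot.is_dir():
        raise FileNotFoundError(f"snapshot directory not found: {snapshot}")
    live.mkdir(parents=True, exist_ok=True)

    copied = []
    for src in sorted(snapshot.iterdir()):
        if src.is_file():  # skip subdirectories, copy sstable files only
            shutil.copy2(src, live / src.name)  # copy2 also keeps timestamps
            copied.append(src.name)
    return copied


# Example call with placeholder paths (not real Cassandra defaults):
# restore_snapshot("/path/to/snapshot", "/path/to/live/data")
```

Run once per node (or pushed out with something like dsh), this is the
step I am asking about doing all at once versus one server at a time.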

Also, is it best to flush_binary (which I think flushes in-memory
tables to disk) and compact prior to snapshotting?

-Anthony

On Mon, Oct 05, 2009 at 08:09:48AM -0500, Jonathan Ellis wrote:
> bin/nodeprobe snapshot
> 
> to restore, move the snapshot sstables from the snapshot location to
> the live data location (e.g. with dsh).
> 
> note that the 0.4 branch, which will become 0.4.1, automatically
> flushes each columnfamily when you ask for a snapshot of the table, so
> you don't have to do that manually anymore.
> 
> On Mon, Oct 5, 2009 at 8:05 AM, Joe Van Dyk <joevandyk@gmail.com> wrote:
> > How do you take the snapshot?  What's the restore process?
> >
> > On Mon, Oct 5, 2009 at 5:22 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
> >> You can take a snapshot and either leave it in place indefinitely or
> >> throw it into your existing backup ecosystem.  That's your best option
> >> for backup no matter which kind of partitioner you're using.
> >>
> >> -Jonathan
> >>
> >> On Mon, Oct 5, 2009 at 12:52 AM, Edmond Lau <edmond@ooyala.com> wrote:
> >>> For folks who are using or considering using cassandra in their
> >>> production systems, what do you use for backups?
> >>>
> >>> With HBase, one could potentially write a mapreduce to perform a row
> >>> scan of the entire table (restricted to some historical timestamp to
> >>> get a consistent view) and export the data to hdfs.  With Cassandra,
> >>> if you're using an ordered partitioner, a similar mechanism could be
> >>> built over a key range scan.
> >>>
> >>> With a random partitioner, though, there's no api to iterate through
> >>> all existing keys.  Why not?
> >>>
> >>> Edmond
> >>>
> >>
> >
> >
> >
> > --
> > Joe Van Dyk
> > http://fixieconsulting.com
> >

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <anthonym@alumni.caltech.edu>
