Subject: Re: Digg's data model
From: Chris Goffinet
Date: Sat, 20 Mar 2010 08:40:40 -0700
To: user@cassandra.apache.org

> 5. Backups : If there is a 4 or 5 TB cassandra cluster what do you recommend the backup scenario's could be?

For the worst-case scenario (total failure), we opted to take global snapshots every 24 hours. This creates hard links to the SSTables on each node. We copy those SSTables to HDFS on a daily basis. We also wrote a patch that sends every event going into the commit log to Scribe, so we have a rolling commit log in HDFS. So in the event that the entire cluster is corrupted, we can take the last 24-hour snapshot plus the commit log from right after that snapshot and get the cluster back into its last known good state.

-Chris
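A minimal sketch of the two ideas above, in Python: snapshotting by hard-linking data files, and picking the log segments written after a snapshot. This is not Cassandra's snapshot code and not the Scribe patch. The helper names `snapshot_sstables` and `segments_after`, the directory layout, and the timestamp-per-segment map are all hypothetical and used only for illustration.

```python
import os
from pathlib import Path


def snapshot_sstables(data_dir: Path, snapshot_dir: Path) -> list:
    """Hard-link every file in data_dir into a new snapshot_dir.

    Hypothetical helper: no data is copied, each snapshot entry is a
    second name for the same on-disk file.
    """
    # Fail if the snapshot directory already exists, so snapshots never mix.
    snapshot_dir.mkdir(parents=True, exist_ok=False)
    linked = []
    for src in sorted(data_dir.iterdir()):
        if src.is_file():
            dst = snapshot_dir / src.name
            os.link(src, dst)  # hard link, requires the same filesystem
            linked.append(dst)
    return linked


def segments_after(log_segments: dict, snapshot_time: float) -> list:
    """Return names of log segments written at or after snapshot_time.

    log_segments maps a segment name to its write timestamp (an assumed
    bookkeeping format). The result is ordered oldest first, which is
    the order a replay would need.
    """
    newer = [(ts, name) for name, ts in log_segments.items() if ts >= snapshot_time]
    return [name for ts, name in sorted(newer)]
```

Because a hard link is just another directory entry for the same file, the snapshot is cheap to create and stays readable even if the original file name is later deleted. The linked files can then be copied off the node at leisure.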