From: tommaso barbugli
To: user@cassandra.apache.org
Date: Fri, 2 May 2014 19:14:46 +0200
Subject: Re: Backup procedure

In my tests, compressing sstables with lzop (with Cassandra compression turned on) resulted in approx. 50% smaller files.
That's probably because the chunks of data compressed by lzop are way bigger than the average size of writes performed on Cassandra (I'm not sure how the data is compressed, but I guess it is done per single cell, so unless one stores)

2014-05-02 19:01 GMT+02:00 Robert Coli <rcoli@eventbrite.com>:

> On Fri, May 2, 2014 at 2:07 AM, tommaso barbugli <tbarbugli@gmail.com> wrote:
>
>> If you are thinking about using Amazon S3 storage I wrote a tool that
>> performs snapshots and backups on multiple nodes.
>> Backups are stored compressed on S3.
>> https://github.com/tbarbugli/cassandra_snapshotter
>
> https://github.com/JeremyGrosser/tablesnap
>
> SSTables in Cassandra are compressed by default, if you are re-compressing
> them you may just be wasting CPU.. :)
>
> =Rob
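
The chunk-size argument in the message above can be checked with a small experiment. The sketch below is only an illustration under stated assumptions: it uses Python's standard zlib as a stand-in for lzop and for Cassandra's own compressor, and the row data, the function names (make_sample, compressed_size_whole, compressed_size_chunked) and the chunk sizes are all hypothetical. It compares compressing the data as one stream against compressing small blocks independently. Note that, as far as I know, Cassandra compresses sstables in fixed-size chunks with a configurable chunk length rather than per cell, so the per-cell guess in the message may not hold, but the general effect (smaller independent blocks compress worse) is the same.

```python
import random
import zlib


def make_sample(n_rows: int = 20000, seed: int = 42) -> bytes:
    """Build repetitive, row-like data (hypothetical; not a real sstable)."""
    rng = random.Random(seed)
    users = [f"user-{i:04d}" for i in range(200)]
    verbs = ["follow", "like", "share", "comment"]
    rows = [
        f"{rng.choice(users)}|{rng.choice(verbs)}|{rng.randrange(10**6):06d}\n"
        for _ in range(n_rows)
    ]
    return "".join(rows).encode("utf-8")


def compressed_size_whole(data: bytes) -> int:
    """Size after compressing the whole buffer as a single stream."""
    return len(zlib.compress(data))


def compressed_size_chunked(data: bytes, chunk_size: int) -> int:
    """Total size when every chunk is compressed independently."""
    return sum(
        len(zlib.compress(data[i:i + chunk_size]))
        for i in range(0, len(data), chunk_size)
    )


if __name__ == "__main__":
    data = make_sample()
    print("raw bytes:", len(data))
    print("one stream:", compressed_size_whole(data))
    for chunk_size in (64, 1024, 65536):
        print(f"{chunk_size}-byte chunks:",
              compressed_size_chunked(data, chunk_size))
```

With very small chunks, each block carries its own header and cannot reuse patterns seen in earlier blocks, so the total grows quickly. Whether a second pass with lzop is worth the CPU on real sstables depends on the data and the configured chunk length, so measuring on a copy of an actual snapshot is the safer guide.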