From: Aaron Turner
Date: Fri, 24 Aug 2012 17:56:07 -0700
Subject: optimizing use of sstableloader / SSTableSimpleUnsortedWriter
To: cassandra users

So I've read: http://www.datastax.com/dev/blog/bulk-loading

Are there any tips for using sstableloader / SSTableSimpleUnsortedWriter to migrate time series data from our old datastore (PostgreSQL) to Cassandra?

After thinking about how sstables are laid out on disk, it seems best (required??) to write out each row at once. I.e., if each row == one year's worth of data and you have, say, 30,000 rows, write one full row at a time (a full year's worth of data points for a given metric) rather than one data point for each of the 30,000 rows.

Any other tips to improve load time or reduce the load on the cluster or subsequent compaction activity? All the CFs I'll be writing to use compression and leveled compaction.

Right now my Cassandra data store has about 4 months of data and we have 5 years of historical data (not sure yet how much we'll actually load, but at minimum one year's worth).

Thanks!

--
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety.
    -- Benjamin Franklin
"carpe diem quam minimum credula postero"
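
The row-at-a-time ordering asked about above can be sketched as follows. This is only an illustration in Python, assuming the data points come out of PostgreSQL already sorted by metric and then by timestamp (for example via ORDER BY). The names `write_rows_whole` and `write_row` are hypothetical; `write_row` stands in for whatever code actually feeds SSTableSimpleUnsortedWriter and is not part of the Cassandra API.

```python
from itertools import groupby
from operator import itemgetter


def write_rows_whole(points, write_row):
    """Emit each row exactly once, with all of its columns together.

    points    -- iterable of (row_key, timestamp, value) tuples, assumed to be
                 sorted by row_key and then by timestamp.
    write_row -- hypothetical callback taking (row_key, columns), where columns
                 is a list of (timestamp, value) pairs for that row.
    Returns the number of rows written.
    """
    rows_written = 0
    # groupby only merges adjacent items, which is why the input must be sorted.
    for row_key, group in groupby(points, key=itemgetter(0)):
        columns = [(timestamp, value) for _, timestamp, value in group]
        write_row(row_key, columns)
        rows_written += 1
    return rows_written
```

Because `groupby` streams over its input, only one row's worth of columns is held in memory at a time, which matters if several years of history are loaded in one pass.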