From: Aaron Turner
Date: Mon, 27 Aug 2012 11:10:13 -0700
Subject: Re: optimizing use of sstableloader / SSTableSimpleUnsortedWriter
To: user@cassandra.apache.org

On Mon, Aug 27, 2012 at 1:19 AM, aaron morton wrote:
> After thinking about how
> sstables are done on disk, it seems best (required??) to write out
> each row at once.
>
> Sort of. We only want one instance of the row per SSTable created.

Ah, good clarification, although I think for my purposes they're one
and the same.

> Any other tips to improve load time or reduce the load on the cluster
> or subsequent compaction activity?
>
> Less SSTables means less compaction. So go as high as you can on the
> bufferSizeInMB param for the
> SSTableSimpleUnsortedWriter.

Ok.

> There is also a SSTableSimpleWriter. Because it expects rows to be ordered
> it does not buffer and can create bigger sstables.
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/SSTableSimpleWriter.java

Hmmm.... prolly not realistic in my situation... doing so would likely
thrash the disks on my PG server a lot more and kill my read
throughput, and that server is already hitting a wall.

>
> Right now my Cassandra data store has about 4 months of data and we
> have 5 years of historical
>
> ingest all the histories!

Actually, I was a little worried about how much space that would
take... my estimate was ~305GB/year, which is a lot when you consider
the 300-400GB/node limit (something I didn't know about at the time).
However, compression has turned out to be extremely efficient on my
dataset... just under 4 months of data is less than 2GB! I'm pretty
thrilled.

-- 
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
    -- Benjamin Franklin
"carpe diem quam minimum credula postero"
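
Editor's sketch: the sizing figures in the message can be worked through as a rough back-of-the-envelope calculation. This is only an illustration under stated assumptions: data volume is taken to grow linearly over time, replication is ignored (one copy of the data only), and "less than 2GB in just under 4 months" is rounded to 2 GB per 4 months. The function and constant names are made up for this example and are not part of Cassandra.

```python
import math

# Figures quoted in the message; everything else here is illustrative.
EST_GB_PER_YEAR = 305        # original uncompressed estimate per year
HISTORY_YEARS = 5            # years of historical data still to ingest
NODE_LIMIT_GB = (300, 400)   # per-node guideline range mentioned in the message
OBSERVED_GB = 2              # rounded from "less than 2GB"
OBSERVED_MONTHS = 4          # rounded from "just under 4 months"


def nodes_needed(total_gb, per_node_gb):
    """Nodes required to hold total_gb at per_node_gb each (single copy, no replication)."""
    return math.ceil(total_gb / per_node_gb)


def projected_gb(observed_gb, observed_months, years):
    """Scale an observed on-disk size linearly to the given number of years."""
    return observed_gb / observed_months * 12 * years


uncompressed_total = EST_GB_PER_YEAR * HISTORY_YEARS
compressed_total = projected_gb(OBSERVED_GB, OBSERVED_MONTHS, HISTORY_YEARS)

print(f"uncompressed estimate: {uncompressed_total} GB")
print(f"nodes at {NODE_LIMIT_GB[1]} GB/node: {nodes_needed(uncompressed_total, NODE_LIMIT_GB[1])}")
print(f"nodes at {NODE_LIMIT_GB[0]} GB/node: {nodes_needed(uncompressed_total, NODE_LIMIT_GB[0])}")
print(f"projected with observed compression: about {compressed_total:.0f} GB")
```

Under these assumptions the original estimate comes to 1525 GB for five years, which would need four to six nodes at the quoted per-node range, while the observed on-disk rate projects to roughly 30 GB, well within a single node.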