Subject: Large data files and no "edit in place"?
From: Julian Simon
To: user@cassandra.apache.org
Date: Tue, 30 Mar 2010 16:54:18 +1100

Forgive me as I'm probably a little out of my depth in trying to assess
this particular design choice within Cassandra, but...

My understanding is that Cassandra never updates data "in place" on disk
- instead it completely re-creates the data files during a "flush". Stop
me if I'm wrong already ;-)

So imagine we have a large data set in our ColumnFamily and we're
constantly adding data to it. Every [x] minutes or [y] bytes, the
compaction process is triggered and the entire data set is written to
disk.

So as our data set grows over time, compaction will result in an
increasingly large IO operation to write all that data to disk each
time. We could easily be talking about single data files in the
many-gigabyte range, no? Or is there a file size limit that I'm not
aware of?

If not, is this an efficient approach for large data sets? It seems like
we would become awfully IO-bound, writing the entire thing from scratch
each time.

Do let me know if I've gotten it all wrong ;-)

Cheers,
Jules
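
P.S. To make my (quite possibly wrong) mental model concrete, here's a
toy Java sketch of what I *think* the write path looks like. All the
class and method names below are my own invention for illustration -
this is not Cassandra's actual code. In this model the flush() step
only writes out a memtable's worth of data; it's the compact() step
that rewrites everything, and that's the part I'm worried about:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.NavigableMap;
    import java.util.TreeMap;

    // Toy log-structured store: hypothetical, for discussion only.
    class SketchStore {
        // Writes land in an in-memory, sorted "memtable" first.
        private NavigableMap<String, String> memtable =
                new TreeMap<String, String>();
        // Each flush appends a new immutable "data file"; existing
        // files are never edited in place.
        private final List<NavigableMap<String, String>> dataFiles =
                new ArrayList<NavigableMap<String, String>>();
        private static final int FLUSH_THRESHOLD = 4; // tiny, for demo

        void put(String key, String value) {
            memtable.put(key, value);
            if (memtable.size() >= FLUSH_THRESHOLD) {
                flush();
            }
        }

        // Flush writes only the current memtable contents as one new
        // file - cost proportional to the memtable, not the whole
        // data set.
        void flush() {
            dataFiles.add(memtable);
            memtable = new TreeMap<String, String>();
        }

        // Compaction merges the existing files into one, keeping the
        // newest value per key. THIS is the step whose cost grows
        // with the total data set size.
        void compact() {
            NavigableMap<String, String> merged =
                    new TreeMap<String, String>();
            for (NavigableMap<String, String> file : dataFiles) {
                merged.putAll(file); // later files win: they're newer
            }
            dataFiles.clear();
            dataFiles.add(merged);
        }

        // Reads check the memtable, then the files, newest first.
        String get(String key) {
            if (memtable.containsKey(key)) {
                return memtable.get(key);
            }
            for (int i = dataFiles.size() - 1; i >= 0; i--) {
                String v = dataFiles.get(i).get(key);
                if (v != null) {
                    return v;
                }
            }
            return null;
        }
    }

If that's roughly right, then my question boils down to: what bounds
the size of the single merged file that compact() produces?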