From: Eric tamme
To: user@cassandra.apache.org
Date: Wed, 22 Jun 2011 13:11:15 -0400
Subject: Re: simple question about merged SSTable sizes
In-Reply-To: <16FEB70B-119E-4377-B993-8D8F39712444@gmail.com>

>> Second, compacting such large files is an IO killer. What can be tuned
>> other than compaction_threshold to help optimize this and prevent the files
>> from getting too big?
>>
>> Thanks!
>
>

Just a personal implementation note - I make heavy use of column TTL, so I
have tuned Cassandra very specifically to hold a fairly constant maximum disk
usage based on my data insertion rate, the TTL, the memtable flush threshold,
and the min compaction threshold.

My data basically lives for 7 days and, depending on where it is in the
compaction cycle, goes from 130 gigs per node up to 160 gigs per node.

If setting a TTL is an option for you, it is one way to auto-purge data and
keep overall size in check.

-Eric
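
The sizing reasoning above can be sketched as a back-of-envelope calculation. This is only a rough model, not something from the original message: it assumes a constant ingest rate and that expired data lingers on disk for some time before compaction removes it. The function name and the `purge_lag_days` parameter are hypothetical, and the numbers in the example are illustrative rather than measured.

```python
def steady_state_disk_gb(ingest_gb_per_day, ttl_days, purge_lag_days):
    """Estimate the (floor, peak) on-disk size per node, in GB.

    Assumptions (hypothetical model, not Cassandra's actual accounting):
      - data arrives at a constant rate of ingest_gb_per_day
      - every column expires ttl_days after it is written
      - expired data stays on disk for up to purge_lag_days
        until a compaction finally drops it
    """
    # Live, unexpired data: everything written in the last ttl_days.
    floor = ingest_gb_per_day * ttl_days
    # Worst case: live data plus expired data still waiting for compaction.
    peak = ingest_gb_per_day * (ttl_days + purge_lag_days)
    return floor, peak


# Illustrative numbers only: 20 GB/day, 7-day TTL, 1 day of purge lag.
print(steady_state_disk_gb(20, 7, 1))  # (140, 160)
```

Under this model the floor is set by the insertion rate and the TTL, while the gap between floor and peak depends on how quickly compaction catches up with expired data, which is where the flush and compaction thresholds come in.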