Date: Thu, 3 Feb 2011 10:45:50 -0500
Subject: Mitigating CASSANDRA-2059 -- leftover files
From: Omer van der Horst Jansen
To: user

Jonathan pointed out in another thread that it looks like I'm running into CASSANDRA-2059, where secondary files are not being properly deleted. My production data set at any given time is less than 100 MB in size, but the Cassandra data directories on each instance are using 30 to 40 times as much space right now, and steadily growing.

I understand I can remove the root cause of the problem by applying the patch that's attached to the bug report or by upgrading to 0.7.1 when it's out.

In the meantime, is it safe to manually delete stale files while Cassandra is running? And how do I determine when a set of files is stale? I'd assume that a given set of files is deletable if there is no -Data.db file and the -Compacted file has zero length.

Example of what I would think is a set of stale files, without a -Data.db file:

ls -l *3090*
-rw-rw-r-- 1 user group    0 Feb  3 10:00 Payload-e-3090-Compacted
-rw-rw-r-- 1 user group  245 Feb  3 10:00 Payload-e-3090-Filter.db
-rw-rw-r-- 1 user group 4362 Feb  3 10:00 Payload-e-3090-Index.db
-rw-rw-r-- 1 user group 4840 Feb  3 10:00 Payload-e-3090-Statistics.db

I've got these all the way back to Payload-e-1-Index.db.

Non-stale files:

ls -l *3095*
-rw-rw-r-- 1 user group        0 Feb  3 10:35 Payload-e-3095-Compacted
-rw-rw-r-- 1 user group 41269735 Feb  3 10:14 Payload-e-3095-Data.db
-rw-rw-r-- 1 user group   286405 Feb  3 10:14 Payload-e-3095-Filter.db
-rw-rw-r-- 1 user group  7608022 Feb  3 10:14 Payload-e-3095-Index.db
-rw-rw-r-- 1 user group     4840 Feb  3 10:14 Payload-e-3095-Statistics.db

There is an active Data.db file, so I'd leave this group alone.

--Omer
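
P.S. To make the rule of thumb above concrete, here is a minimal Python sketch that only lists candidate sets and deletes nothing. It encodes my assumption (no -Data.db file plus a zero-length -Compacted marker), which has not been confirmed as safe. The function name find_stale_candidates and the file-name pattern are my own illustration, inferred from the listings above, not anything that ships with Cassandra.

```python
import os
import re
from collections import defaultdict

# Assumed naming scheme, inferred from the listings above:
#   <ColumnFamily>-<version>-<generation>-<Component>[.db]
# e.g. "Payload-e-3090-Filter.db" or "Payload-e-3090-Compacted".
COMPONENT_RE = re.compile(r"^(?P<prefix>.+-\d+)-(?P<component>[A-Za-z]+)(?:\.db)?$")


def find_stale_candidates(data_dir):
    """Return {prefix: [paths]} for file sets that match the heuristic.

    Heuristic (an assumption, not a confirmed rule): a set is a candidate
    when it has no Data component and its Compacted marker is zero bytes.
    Nothing is deleted; the caller decides what to do with the result.
    """
    groups = defaultdict(dict)
    for name in os.listdir(data_dir):
        match = COMPONENT_RE.match(name)
        if match is None:
            continue  # not an SSTable component by this naming scheme
        path = os.path.join(data_dir, name)
        if os.path.isfile(path):
            groups[match.group("prefix")][match.group("component")] = path

    stale = {}
    for prefix, components in groups.items():
        marker = components.get("Compacted")
        if (
            "Data" not in components
            and marker is not None
            and os.path.getsize(marker) == 0
        ):
            stale[prefix] = sorted(components.values())
    return stale
```

Run against a directory containing the two groups shown above, this would report the 3090 set and skip the 3095 set, because the latter still has a Data.db file. The directory to pass in is whatever column family data directory is in use on the node; I have not hard-coded one here.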