Date: Wed, 17 Jul 2013 09:52:16 +0800 (CST)
From: 杨辉强
To: user@cassandra.apache.org
Subject: Re: Deletion use more space.

Thanks, but Michael's answer confuses me more.
I ran "list cf;" in cassandra-cli. It seems that lots of rows have been deleted, but their keys still exist.
Why do the keys still exist after the deletion? They seem useless.

RowKey: 3030303031306365633862356437636365303861303433343137656531306435
-------------------
RowKey: 3030303031316333616336366531613636373735396363323037396331613230
-------------------
RowKey: 3030303031316333616336366531613637303964616364363630663865313433
-------------------
RowKey: 3030303031323934613637303239323563633133303238626330646666626335
-------------------
RowKey: 3030303031323934613637303239323566303733303638373138366334323436
-------------------
RowKey: 3030303031333838333139303930633664643364613331316664363134656639
-------------------
RowKey: 3030303031336265343639303630613938376333366230363439316336333230
-------------------
RowKey: 3030303031336365653735376465616334633932333363363832653130363733
-------------------
RowKey: 3030303031343632343261363966376464656235373266663761633233353065

----- Original Message -----
From: "Michael Theroux"
To: user@cassandra.apache.org
Sent: Tuesday, July 16, 2013 10:23:32 PM
Subject: Re: Deletion use more space.

The only time information is removed from the filesystem is during compaction. Compaction can remove tombstones after gc_grace_seconds, which could result in reanimation of deleted data if the tombstone was never properly replicated to other replicas. Repair will make sure tombstones are consistent among replicas. However, tombstones cannot be removed if the data the tombstone is deleting is in another SSTable and has not yet been removed.

Hope this helps,
-Mike

On Jul 16, 2013, at 10:04 AM, Andrew Bialecki wrote:

> I don't think setting gc_grace_seconds to an hour is going to do what you'd expect. After gc_grace_seconds, if you haven't run a repair within that hour, the data you deleted will seem to have been undeleted.
>
> Someone correct me if I'm wrong, but in order to completely delete data and regain the space it takes up, you need to "delete" it, which creates tombstones, and then run a repair on that column family within gc_grace_seconds. After that the data is actually gone and the space reclaimed.
>
>
> On Tue, Jul 16, 2013 at 6:20 AM, 杨辉强 wrote:
> Thank you!
> It should be "update column family ScheduleInfoCF with gc_grace = 3600;"
> Faint.
>
> ----- Original Message -----
> From: "杨辉强"
> To: user@cassandra.apache.org
> Sent: Tuesday, July 16, 2013 6:15:12 PM
> Subject: Re: Deletion use more space.
>
> Hi,
> I used the following command to update gc_grace_seconds. It reports an error. Why?
>
> [default@WebSearch] update column family ScheduleInfoCF with gc_grace_seconds = 3600;
> java.lang.IllegalArgumentException: No enum const class org.apache.cassandra.cli.CliClient$ColumnFamilyArgument.GC_GRACE_SECONDS
>
>
> ----- Original Message -----
> From: "Michał Michalski"
> To: user@cassandra.apache.org
> Sent: Tuesday, July 16, 2013 5:51:49 PM
> Subject: Re: Deletion use more space.
>
> Deletion does not really "remove" data; it adds tombstones
> (deletion markers). They are later merged with existing data during
> compaction and eventually removed (see gc_grace_seconds), but until
> then they take up some space.
>
> http://wiki.apache.org/cassandra/DistributedDeletes
>
> M.
>
> On 16.07.2013 at 11:46, 杨辉强 wrote:
> > Hi, all:
> > I use Cassandra 1.2.4 with a 4-node ring and the byte-ordered partitioner.
> > I inserted about 200 GB of data into the ring over the previous days.
> >
> > Today I wrote a program that scans the ring and deletes each item as it is scanned.
> > To my surprise, Cassandra's disk usage increased.
> >
> > Can anybody tell me why? Thanks.
> >
>
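
For readers following the thread, the behaviour described above (a delete writes a tombstone, and space only comes back at compaction after gc_grace_seconds) can be sketched with a toy model. This is not Cassandra code: Cell, delete, compact and stored_cells are hypothetical names invented for illustration, each "SSTable" is just a dict, and compact always merges every table at once, so the case Michael mentions (shadowed data sitting in an SSTable that is not part of the compaction) does not arise here. Only the 3600-second value comes from the thread.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

GC_GRACE_SECONDS = 3600  # value discussed in the thread; the rest is illustrative


@dataclass
class Cell:
    value: Optional[str]  # None marks a tombstone (hypothetical encoding)
    timestamp: int        # write time in seconds

    @property
    def is_tombstone(self) -> bool:
        return self.value is None


SSTable = Dict[str, Cell]


def delete(sstables: List[SSTable], key: str, now: int) -> None:
    """A delete appends a tombstone in a new table; old tables are untouched."""
    sstables.append({key: Cell(None, now)})


def compact(sstables: List[SSTable], now: int) -> List[SSTable]:
    """Merge all tables, keep the newest cell per key, and drop tombstones
    that are at least GC_GRACE_SECONDS old."""
    merged: SSTable = {}
    for table in sstables:
        for key, cell in table.items():
            if key not in merged or cell.timestamp > merged[key].timestamp:
                merged[key] = cell
    kept = {
        key: cell
        for key, cell in merged.items()
        if not (cell.is_tombstone and now - cell.timestamp >= GC_GRACE_SECONDS)
    }
    return [kept]


def stored_cells(sstables: List[SSTable]) -> int:
    """Rough stand-in for disk usage: total cells across all tables."""
    return sum(len(table) for table in sstables)


sstables: List[SSTable] = [{"row1": Cell("a", 0), "row2": Cell("b", 0)}]
print(stored_cells(sstables))   # 2

delete(sstables, "row1", now=10)
print(stored_cells(sstables))   # 3: the delete added data instead of removing it

sstables = compact(sstables, now=20)
print(stored_cells(sstables))   # 2: the tombstone is younger than gc_grace, so it stays
print("row1" in sstables[0])    # True: the key is still stored, only as a marker

sstables = compact(sstables, now=10 + GC_GRACE_SECONDS)
print(stored_cells(sstables))   # 1: the tombstone is finally dropped
```

In this model the key of a deleted row remains stored until a compaction runs after the grace period, which may be why "list cf;" still shows keys for rows whose data has been deleted.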