From: Sylvain Lebresne
Date: Wed, 30 Nov 2011 09:44:54 +0100
Subject: Re: trouble with deleted counter columns
To: user@cassandra.apache.org

On Wed, Nov 30, 2011 at 8:36 AM, Thorsten
von Eicken wrote:
> Running a single 1.0.3 node and using counter columns I have a problem.
> I have rows with ~200k counters. I deleted a number of such rows and now
> I can't put counters back in, or really, I can't query what I put back in.

The reason is explained at
http://wiki.apache.org/cassandra/Counters#Technical_limitations, though it
wasn't clear that it covered your situation (I've just updated it
though). To rephrase, counter removal is only supported if it is definitive.
You cannot increment after a deletion. Or rather, if you do, the behavior
is undefined. This holds for row deletion too: if you delete a row, you
can't increment any counter that was there previously (the truth being
that if you wait long enough it would work, but how long is enough depends
on things like when compaction happens and what your gc_grace value is).

Note that I understand this could be a problem for your use case, but that
is an unfortunate limitation of the current design.

> Example using the cli:
> [default@rslog_production] get req_word_freq['20111124'];
> Returned 0 results.
> Elapsed time: 2089 msec(s).
> [default@rslog_production] incr req_word_freq['20111124']['test'];
> Value incremented.
> [default@rslog_production] get req_word_freq['20111124'];
> Returned 0 results.
> Elapsed time: 2018 msec(s).
>
> Note how long it's taking, presumably because it's going through 200K+
> tombstones?

That is likely the reason, yes.

>
> Here's the same using a fresh row key, note the timings:
> [default@rslog_production] get req_word_freq['test'];
> Returned 0 results.
> Elapsed time: 1 msec(s).
> [default@rslog_production] incr req_word_freq['test']['test'];
> Value incremented.
> [default@rslog_production] get req_word_freq['test'];
> => (counter=test, value=1)
> Returned 1 results.
> Elapsed time: 6 msec(s).
>
> Incidentally, I then tried out deleting the column and I don't
> understand why the value is 2 at the end:
> [default@rslog_production] del req_word_freq['test'];
> row removed.
> [default@rslog_production] get req_word_freq['test'];
> Returned 0 results.
> Elapsed time: 1 msec(s).
> [default@rslog_production] incr req_word_freq['test']['test'];
> Value incremented.
> [default@rslog_production] get req_word_freq['test'];
> => (counter=test, value=2)
> Returned 1 results.
> Elapsed time: 1 msec(s).
>
> All this is on a single node system, running the cassandra-cli on the
> system itself. The CF is as follows:
> [default@rslog_production] describe req_word_freq;
>     ColumnFamily: req_word_freq
>       Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
>       Default column value validator:
> org.apache.cassandra.db.marshal.CounterColumnType
>       Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
>       Row cache size / save period in seconds / keys to save : 0.0/0/all
>       Row Cache Provider:
> org.apache.cassandra.cache.SerializingCacheProvider
>       Key cache size / save period in seconds: 200000.0/14400
>       GC grace seconds: 864000
>       Compaction min/max thresholds: 4/32
>       Read repair chance: 1.0
>       Replicate on write: true
>       Built indexes: []
>       Compaction Strategy:
> org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
>
> I must be missing something...
> Thorsten
>
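
For a sense of scale on the "wait long enough" remark: the quoted CF has GC grace seconds set to 864000. Below is a minimal sketch of that arithmetic in Python. The deletion timestamp and the helper name `earliest_purge_time` are made up for illustration. Note also that passing gc_grace only makes a tombstone eligible for removal; a compaction still has to run over the relevant SSTables before it is actually dropped, so this is a lower bound rather than a guarantee.

```python
from datetime import datetime, timedelta, timezone

# "GC grace seconds" value taken from the describe output quoted above.
GC_GRACE_SECONDS = 864000


def earliest_purge_time(deleted_at: datetime,
                        gc_grace_seconds: int = GC_GRACE_SECONDS) -> datetime:
    """Earliest moment a tombstone written at deleted_at becomes eligible
    to be dropped by a compaction (hypothetical helper, illustration only)."""
    return deleted_at + timedelta(seconds=gc_grace_seconds)


# Hypothetical deletion time, chosen only for illustration.
deleted_at = datetime(2011, 11, 30, 7, 36, tzinfo=timezone.utc)

grace_days = GC_GRACE_SECONDS / 86400
print(grace_days)                       # 10.0
print(earliest_purge_time(deleted_at))  # 2011-12-10 07:36:00+00:00
```

So with this setting, a row deleted today keeps its tombstones for at least ten days, which is why re-incrementing the same counters shortly after a delete falls into the unsupported case described above.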