Date: Sun, 29 May 2011 00:24:16 -0700
Subject: Re: expiring + counter column?
From: Yang
To: user@cassandra.apache.org

sorry to beat a dead horse.

I looked at the link referred from #2103: https://issues.apache.org/jira/browse/CASSANDRA-2101
I agree with the reasoning in #2101 that the ultimate issue is that deletes and counter adds are not commutative. since by definition we can't achieve predictable behavior with deletes + counters, can we redefine the behavior of counter deletes, so that we can always guarantee the declared behavior? specifically:

*we define that once a counter column is deleted, you can never add to it again.* attempts to add to a dead counter throw an exception ---- all future adds are just ignored. i.e. a counter column has only one life, until all tombstones are purged from the system, after which it is possible for the counter to have a new incarnation. basically, instead of solving the problem raised in #2103, we declare openly that it's unsolvable (which is true), and make the code reflect this fact.

I think this behavior would satisfy most use cases of counters. so instead of relying on the advice to developers, "do not do updates for a period after deletes, otherwise it probably won't work", we enforce this in the code.
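to make the rule concrete, here is a minimal sketch of it on a single replica. the names (`OneLifeCounter`, `replay`) are hypothetical and only model the proposal; this is not Cassandra code, and it picks the "ignore" variant rather than throwing an exception. it replays the same three operations in two arrival orders and shows that both end in the same state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class OneLifeCounter:
    """Hypothetical model of the proposed rule; not a Cassandra class."""
    total: int = 0
    dead: bool = False  # True once any tombstone has been applied

    def add(self, delta: int = 1) -> None:
        # Under the proposed rule, adds to a dead counter are ignored.
        if not self.dead:
            self.total += delta

    def delete(self) -> None:
        # A delete kills the counter; it stays dead until tombstones are purged
        # (purging is not modeled here).
        self.dead = True

    def read(self) -> Optional[int]:
        # A dead counter reads as "not present".
        return None if self.dead else self.total


def replay(ops: List[str]) -> List[Optional[int]]:
    """Apply 'c' (add with delta 1) and 'd' (delete) in arrival order,
    returning the value read after each operation."""
    counter = OneLifeCounter()
    snapshots = []
    for op in ops:
        if op == "c":
            counter.add(1)
        elif op == "d":
            counter.delete()
        else:
            raise ValueError(f"unknown operation: {op!r}")
        snapshots.append(counter.read())
    return snapshots


in_order = replay(["c", "d", "c"])      # add, delete, add
out_of_order = replay(["c", "c", "d"])  # both adds arrive before the delete
```

the intermediate snapshots differ between the two orders, but the final read is the same in both, which is the property the proposal is after.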
the same logic can be carried over to expiring columns, since they are essentially automatically inserted deletes. that way #2103 could be "solved".

I'm attaching an example below; you can refer to it if needed.

Thanks a lot
Yang

example:
for simplicity we assume there is only one column family and one column, so we omit the column name and cf name in our notation. assume all counter columns have a delta value of 1, so we only mark their ttl. c(123) means a counter column with ttl=123, adding a delta of 1. d(456) means a tombstone with ttl=456.

then we can have the following operations:

operation          result after operation
----------------------------------------------------------------------
c(1)               count = 1
d(2)               count = null (counter not present)
c(3)               count = null (add on dead counter ignored)
----------------------------------------------------------------------

if the 2 adds arrive out of order, we would still guarantee eventual consistency:

operation          result after operation
----------------------------------------------------------------------
c(1)               count = 1
c(3)               count = 2 (we have 2 adds, each with delta=1)
d(2)               count = null (deleted)
----------------------------------------------------------------------

at the end of both scenarios, the result is guaranteed to be null.
note that in the second scenario, line 2 shows a snapshot where we have a state with count=2, which scenario 1 never sees. this is fine, since even regular columns can have this situation (just consider if the counter columns were inserts/overwrites instead).

On Fri, May 27, 2011 at 5:57 PM, Jonathan Ellis wrote:
> No. See comments to https://issues.apache.org/jira/browse/CASSANDRA-2103
>
> On Fri, May 27, 2011 at 7:29 PM, Yang wrote:
>> is this combination feature available , or on track ?
>>
>> thanks
>> Yang
>>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
