From user-return-59387-archive-asf-public=cust-asf.ponee.io@cassandra.apache.org Fri Jan 12 17:29:01 2018
Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
From: Python_Max
Date: Fri, 12 Jan 2018 18:28:53 +0200
Subject: Re: Too many tombstones using TTL
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary="94eb2c19e71eaf21aa056296c271"

--94eb2c19e71eaf21aa056296c271
Content-Type: text/plain; charset="UTF-8"

Thank you for the response.

I know about the option of setting a TTL per column or even per item in a
collection. However, in my example the entire row has expired. Shouldn't
Cassandra be able to detect this situation and create a single tombstone for
the entire row instead of many?
Is there any reason not to do this, other than that no one needs it? Is this
suitable for a feature request or improvement?

Thanks.

On Wed, Jan 10, 2018 at 4:52 PM, DuyHai Doan wrote:

> "The question is why Cassandra creates a tombstone for every column
> instead of single tombstone per row?"
>
> --> Simply because it is technically possible to set a different TTL value
> on each column of a CQL row
>
> On Wed, Jan 10, 2018 at 2:59 PM, Python_Max wrote:
>
>> Hello, C* users and experts.
>>
>> I have (one more) question about tombstones.
>>
>> Consider the following example:
>> cqlsh> create keyspace test_ttl with replication = {'class': 'SimpleStrategy', 'replication_factor': '1'}; use test_ttl;
>> cqlsh> create table items(a text, b text, c1 text, c2 text, c3 text, primary key (a, b));
>> cqlsh> insert into items(a,b,c1,c2,c3) values('AAA', 'BBB', 'C111', 'C222', 'C333') using ttl 60;
>> bash$ nodetool flush
>> bash$ sleep 60
>> bash$ nodetool compact test_ttl items
>> bash$ sstabledump mc-2-big-Data.db
>>
>> [
>>   {
>>     "partition" : {
>>       "key" : [ "AAA" ],
>>       "position" : 0
>>     },
>>     "rows" : [
>>       {
>>         "type" : "row",
>>         "position" : 58,
>>         "clustering" : [ "BBB" ],
>>         "liveness_info" : { "tstamp" : "2018-01-10T13:29:25.777Z", "ttl" : 60, "expires_at" : "2018-01-10T13:30:25Z", "expired" : true },
>>         "cells" : [
>>           { "name" : "c1", "deletion_info" : { "local_delete_time" : "2018-01-10T13:29:25Z" }
>>           },
>>           { "name" : "c2", "deletion_info" : { "local_delete_time" : "2018-01-10T13:29:25Z" }
>>           },
>>           { "name" : "c3", "deletion_info" : { "local_delete_time" : "2018-01-10T13:29:25Z" }
>>           }
>>         ]
>>       }
>>     ]
>>   }
>> ]
>>
>> The question is why Cassandra creates a tombstone for every column
>> instead of a single tombstone per row?
>>
>> In the production environment I have a table with ~30 columns, and it gives
>> me a warning for 30k tombstones and 300 live rows. That is 30 times more
>> than it could be.
>> Can this behavior be tuned in some way?
>>
>> Thanks.
>>
>> --
>> Best regards,
>> Python_Max.
>>
>
>

--
Best regards,
Python_Max.

--94eb2c19e71eaf21aa056296c271
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

--94eb2c19e71eaf21aa056296c271--
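
The per-cell tombstones discussed in this thread can be counted directly from sstabledump output. The following is only a minimal sketch in Python, not part of the original message: `count_cell_tombstones` and `SAMPLE_DUMP` are hypothetical names, the sample is the dump quoted in the thread, and the sketch assumes that a cell counts as a tombstone whenever it carries a `deletion_info` object, as the cells in that dump do.

```python
import json

# Copy of the sstabledump output quoted in this thread (whitespace condensed).
SAMPLE_DUMP = """
[
  {
    "partition" : { "key" : [ "AAA" ], "position" : 0 },
    "rows" : [
      {
        "type" : "row",
        "position" : 58,
        "clustering" : [ "BBB" ],
        "liveness_info" : { "tstamp" : "2018-01-10T13:29:25.777Z", "ttl" : 60,
                            "expires_at" : "2018-01-10T13:30:25Z", "expired" : true },
        "cells" : [
          { "name" : "c1", "deletion_info" : { "local_delete_time" : "2018-01-10T13:29:25Z" } },
          { "name" : "c2", "deletion_info" : { "local_delete_time" : "2018-01-10T13:29:25Z" } },
          { "name" : "c3", "deletion_info" : { "local_delete_time" : "2018-01-10T13:29:25Z" } }
        ]
      }
    ]
  }
]
"""


def count_cell_tombstones(dump_text):
    """Return (row_count, cell_tombstone_count) for sstabledump-style JSON.

    Hypothetical helper. Assumption: a cell is a tombstone when it has a
    "deletion_info" object, as in the sample above.
    """
    rows = 0
    tombstones = 0
    for partition in json.loads(dump_text):
        for row in partition.get("rows", []):
            if row.get("type") != "row":
                continue  # skip entries that are not regular rows
            rows += 1
            tombstones += sum(
                1 for cell in row.get("cells", []) if "deletion_info" in cell
            )
    return rows, tombstones


rows, tombstones = count_cell_tombstones(SAMPLE_DUMP)
print(f"{rows} row(s), {tombstones} cell tombstone(s)")
```

For the dump above this reports one row carrying three cell tombstones, one per regular column (c1, c2, c3). By the same arithmetic, a table with about 30 regular columns would reach 30k cell tombstones after roughly 1,000 fully expired rows, which is the factor of 30 mentioned in the original question.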