Return-Path: X-Original-To: apmail-cassandra-user-archive@www.apache.org Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 2ECF518714 for ; Tue, 22 Dec 2015 09:36:03 +0000 (UTC) Received: (qmail 74279 invoked by uid 500); 22 Dec 2015 09:36:00 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 74240 invoked by uid 500); 22 Dec 2015 09:35:59 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 74230 invoked by uid 99); 22 Dec 2015 09:35:59 -0000 Received: from Unknown (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 22 Dec 2015 09:35:59 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id 4958F18051A for ; Tue, 22 Dec 2015 09:35:59 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.879 X-Spam-Level: ** X-Spam-Status: No, score=2.879 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=3, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd3-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-us-east.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id ecz4EfByTGAV for ; Tue, 22 Dec 2015 09:35:58 +0000 (UTC) Received: from mail-yk0-f193.google.com (mail-yk0-f193.google.com [209.85.160.193]) by mx1-us-east.apache.org (ASF Mail Server at mx1-us-east.apache.org) with ESMTPS id C805842BE6 for ; Tue, 22 Dec 2015 09:35:57 +0000 (UTC) Received: by mail-yk0-f193.google.com with SMTP id v6so14701556ykc.3 for ; Tue, 22 Dec 2015 01:35:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=SWJRjwZyg1mSYgRh0lumdb+VWBxfkhpzUSxziR8aCtc=; b=wYVX5zyJ04m4kqfCkfwds9S000XU/JBj3BpIdsg9jReTUCgyJKSHSryq08oy79vJ2d lu0a8T5a+/bFBvLKrddYDYhb8sXhrFXJe6JyxU62TbVECf/obuqXsIe2he0OqAy25AQc NWFPG/LemMISSPvrXYBYvcF+uB2K9sFjiNBsAd7gxQLmSrPdyOPyMPWFnb91lgsk4LTm HqkAbyyvOExtWlm+gKq3xWgpPaHKgMYDh3ig2VJRBIRQG5xbTLvuR0RzUEZ/4LMQ/eTj TOvJHKTgpSHYUHXLqcMX51l8DOpI5Uf8x7c3ki9lsIJhr7H6Go4avj549K6oEC75JrWa JLfg== MIME-Version: 1.0 X-Received: by 10.129.129.130 with SMTP id r124mr20305402ywf.242.1450776951359; Tue, 22 Dec 2015 01:35:51 -0800 (PST) Received: by 10.129.147.194 with HTTP; Tue, 22 Dec 2015 01:35:51 -0800 (PST) Date: Tue, 22 Dec 2015 15:05:51 +0530 Message-ID: Subject: Using TTL for data purge From: "jaalex.tech" To: user@cassandra.apache.org Content-Type: multipart/alternative; boundary=94eb2c055098d5919b05277954cb --94eb2c055098d5919b05277954cb Content-Type: text/plain; charset=UTF-8 Hi, I'm looking for suggestions/caveats on using TTL as a subsitute for a manual data purge job. We have few tables that hold user information - this could be guest or registered users, and there could be between 500K to 1M records created per day per table. Currently, these tables have a secondary indexed updated_date column which is populated on each update. However, we have been getting timeouts when running queries using updated_date when the number of records are high, so i don't think this would be a reliable option in the long term when we need to purge records that have not been used for the last X days. In this scenario, is it advisable to include a high enough TTL (i.e the amount of time we want these to last, could be 3 to 6 months) when inserting/updating records? There could be cases where the TTL may get reset after couple of days/weeks, when the user visits the site again. The tables have fixed number of columns, except for one which has a clustering key, and may have max 10 entries per partition key. I need to know the overhead of having so many rows with TTL hanging around for a relatively longer duration (weeks/months), and the impacts it could have on performance/storage. If this is not a recommended approach, what would be an alternate design which could be used for a manual purge job, without using secondary indices. We are using Cassandra 2.0.x. Thanks, Joseph --94eb2c055098d5919b05277954cb Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Hi,

I'm looking for suggestions/cav= eats on using TTL as a subsitute for a manual data purge job.=C2=A0

We have few tables that hold user information - this coul= d be guest or registered users, and there could be between 500K to 1M recor= ds created per day per table. Currently, these tables have a secondary inde= xed updated_date column which is populated on each update. However, we have= been getting timeouts when running queries using updated_date when the num= ber of records are high, so i don't think this would be a reliable opti= on in the long term when we need to purge records that have not been used f= or the last X days.=C2=A0

In this scenario, is it = advisable to include a high enough TTL (i.e the amount of time we want thes= e to last, could be 3 to 6 months) when inserting/updating records?=C2=A0

There could be cases where the TTL may get reset af= ter couple of days/weeks, when the user visits the site again.
The tables have fixed number of columns, except for one which = has a clustering key, and may have max 10 entries per =C2=A0partition key.<= /div>

I need to know the overhead of having so many rows= with TTL hanging around for a relatively longer duration (weeks/months), a= nd the impacts it could have on performance/storage. If this is not a recom= mended approach, what would be an alternate design which could be used for = a manual purge job, without using secondary indices.

We are using=C2=A0Cassandra 2.0.x.

Thanks,
Joseph

--94eb2c055098d5919b05277954cb--