From: Ali Akhtar
Date: Tue, 8 Nov 2016 15:10:56 +0500
Subject: Re: Improving performance where a lot of updates and deletes are required?
To: user@cassandra.apache.org
Reply-To: user@cassandra.apache.org
Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=047d7b873f803d48470540c75b58
archived-at: Tue, 08 Nov 2016 10:11:02 -0000

--047d7b873f803d48470540c75b58
Content-Type: text/plain; charset=UTF-8

Yes, because there will also be a lot of inserts, and the linear
scalability that c* offers is required.

But the inserts aren't static, and the data that comes in will need to be
updated in response to user events.

Data which hasn't been touched for over a week has to be deleted.
(Sensitive data, so better to delete it when it's out of date rather than
store it.)

Couldn't really do the weekly tables without massively complicating my
report generation, as the entire dataset needs to be queried for generating
certain reports.

So my question is really about how to get the best out of c* in this sort
of scenario.

On Tue, Nov 8, 2016 at 3:05 PM, DuyHai Doan wrote:

> Are you sure Cassandra is a good fit for this kind of heavy update &
> delete scenario?
>
> Otherwise, you can always use several tables (one table/day, rotating
> through 7 days for a week) and do a truncate of the table at the end of the
> day.
>
> On Tue, Nov 8, 2016 at 11:04 AM, Ali Akhtar wrote:
>
>> I have a use case where a lot of updates and deletes to a table will be
>> necessary.
>>
>> The deletes will be done at a scheduled time, probably at the end of the
>> day, each day.
>>
>> Updates will be done throughout the day, as new data comes in.
>>
>> Are there any guidelines on improving Cassandra's performance for this
>> use case? Any caveats to be aware of? Any tips, like running nodetool
>> repair every X days?
>>
>> Thanks.
>>
>
>

--047d7b873f803d48470540c75b58
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--047d7b873f803d48470540c75b58--
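
For reference, the rotating-table idea quoted in this thread (one table per day, cycling through 7 days, with a truncate at the end of the day) might be sketched as below. This is only one reading of that suggestion: the `events_` table prefix, the weekday suffixes, and the choice to truncate the table that is about to be reused are all assumptions for illustration, not something stated in the thread. The sketch only computes table names; it does not talk to Cassandra, and it does not handle moving a row when it is updated on a later day.

```python
from datetime import date, timedelta

# Hypothetical naming scheme (assumption): one table per weekday,
# e.g. events_mon ... events_sun.
TABLE_PREFIX = "events"
SUFFIXES = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def table_for(day: date) -> str:
    """Return the name of the table that receives rows written on `day`."""
    return f"{TABLE_PREFIX}_{SUFFIXES[day.weekday()]}"


def table_to_truncate(today: date) -> str:
    """Return the table to truncate at the end of `today`.

    Tomorrow's writes go to the table that was last written a week
    before tomorrow, so that table is emptied before it is reused.
    """
    return table_for(today + timedelta(days=1))


def tables_for_report() -> list:
    """Return all seven table names; a full-dataset report must read each."""
    return [f"{TABLE_PREFIX}_{suffix}" for suffix in SUFFIXES]
```

The last helper makes the objection raised in the reply concrete: any report over the entire dataset has to query all seven tables and merge the results on the client side.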