Date: Thu, 6 Apr 2017 14:53:41 +0000 (UTC)
From: "Romain GERARD (JIRA)"
To: commits@cassandra.apache.org
Subject: [jira] [Comment Edited] (CASSANDRA-13418) Allow TWCS to ignore overlaps

[ https://issues.apache.org/jira/browse/CASSANDRA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959035#comment-15959035 ]

Romain GERARD edited comment on CASSANDRA-13418 at 4/6/17 2:53 PM:
-------------------------------------------------------------------

I may be wrong, but wasn't unchecked_tombstone_compaction, combined with tombstone_compaction_interval, designed for this use case? I realize that dropping the SSTable is more efficient than compacting it, so if someone knowledgeable can tell me, I would be pleased.
I am not against adding another option, but I would rather be confident that I am adding it out of need and not because I missed something that already exists in Cassandra.

was (Author: rgerard):
I may be wrong but wasn't unchecked_tombstone_compaction combined with tombstone_compaction_interval designed to be used for this use case? Even if dropping the sstable is more efficient than compacting it. If someone knowledgeable can tell me I would be pleased.
I am not against adding another option, but I would rather have the confidence that I add it out of need rather than because I missed something already existing in Cassandra.

> Allow TWCS to ignore overlaps
> -----------------------------
>
>                 Key: CASSANDRA-13418
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13418
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Corentin Chary
>              Labels: twcs
>
> http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html explains it well. If you really want read-repairs, you're going to have SSTables blocking the expiration of other fully expired SSTables because they overlap.
> You can set unchecked_tombstone_compaction = true or tombstone_threshold to a very low value, and that will purge the blockers of old data that should already have expired, thus removing the overlaps and allowing the other SSTables to expire.
> The thing is that this is rather CPU intensive and not optimal. If you have time series, you might not care if not all of your data expires at exactly the right time, or if data re-appears for some time, as long as it gets deleted as soon as it can. In this situation I believe it would be really beneficial to allow users to simply ignore overlapping SSTables when looking for fully expired ones.
> To the question: why would you need read-repairs?
> - Full repairs basically take longer than the TTL of the data on my dataset, so this isn't really effective.
> - Even with a 10% chance of doing a repair, we found out that this would be enough to greatly reduce entropy of the most used data (and if you have time series, you're likely to have a dashboard doing the same important queries over and over again).
> - LOCAL_QUORUM is too expensive (need >3 replicas), QUORUM is too slow.
> I'll try to come up with a patch demonstrating how this would work, try it on our system and report the effects.
> cc: [~adejanovski], [~rgerard] as I know you worked on similar issues already.
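
To make the blocking behaviour described in the issue concrete, here is a minimal sketch in Python. It is a simplified model, not Cassandra's actual code: the SSTable fields, the fully_expired function and the ignore_overlaps flag are all hypothetical names chosen for illustration, and every SSTable is assumed to overlap every other one. An SSTable counts as expired when all of its data is past gc_before, but it may only be dropped if it is also older than the oldest data held by any SSTable that is still live.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SSTable:
    """Hypothetical, simplified view of per-SSTable metadata."""
    name: str
    min_timestamp: int            # oldest write timestamp in the file
    max_timestamp: int            # newest write timestamp in the file
    max_local_deletion_time: int  # time by which all of its data has expired


def fully_expired(sstables: List[SSTable], gc_before: int,
                  ignore_overlaps: bool = False) -> List[SSTable]:
    """Return the SSTables that could be dropped without compaction.

    Assumes all given SSTables overlap each other (a simplification).
    """
    # Candidates: everything in the file has expired before gc_before.
    expired = [s for s in sstables if s.max_local_deletion_time < gc_before]
    if ignore_overlaps:
        # Behaviour proposed in the issue: skip the overlap check entirely.
        return expired

    # SSTables that still hold live data can block the candidates.
    live = [s for s in sstables if s.max_local_deletion_time >= gc_before]
    if not live:
        return expired

    # A candidate is safe to drop only if it is strictly older than the
    # oldest data in any live SSTable; otherwise it might shadow that data.
    oldest_live = min(s.min_timestamp for s in live)
    return [s for s in expired if s.max_timestamp < oldest_live]


# An old, expired SSTable, and a newer one that received a read-repaired
# cell carrying an old timestamp (150), which makes the two overlap.
old = SSTable("old", min_timestamp=100, max_timestamp=200,
              max_local_deletion_time=300)
repaired = SSTable("repaired", min_timestamp=150, max_timestamp=1000,
                   max_local_deletion_time=1100)

blocked = fully_expired([old, repaired], gc_before=500)
dropped = fully_expired([old, repaired], gc_before=500, ignore_overlaps=True)
```

In this model the read-repaired cell keeps "old" from being dropped until "repaired" itself expires or is compacted. Skipping the check removes that dependency, at the cost the issue already accepts: data may expire late or briefly re-appear.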