Date: Fri, 29 Sep 2017 23:03:00 +0000 (UTC)
From: "Jeremiah Jordan (JIRA)"
To: commits@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-13910) Consider deprecating (then removing) read_repair_chance/dclocal_read_repair_chance

    [ https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16186643#comment-16186643 ]

Jeremiah Jordan commented on CASSANDRA-13910:
---------------------------------------------

bq. Asked differently: why would someone set read_repair_chance to a high value?

If someone had a multi-DC setup, wrote with LOCAL_* consistency levels, and then read from that DC, they'd trigger the async repair into the other DC, working around the DC-local consistency without paying the latency cost on the app side. Sort of an "I can force hint delivery on read without waiting for it" technique. I don't know how common such a use case would be, but I would believe that someone, somewhere is relying on it.

I definitely have done exactly this in the past, when hints were WAY less reliable. With the current hints implementation, which is pretty reliable, I do not see the need for doing something like this anymore.

> Consider deprecating (then removing) read_repair_chance/dclocal_read_repair_chance
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13910
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13910
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Priority: Minor
>              Labels: CommunityFeedbackRequested
>
> First, let me clarify, so this is not misunderstood, that I'm not *at all* suggesting removing the read-repair mechanism of detecting and repairing inconsistencies between read responses: that mechanism is imo fine and useful. But {{read_repair_chance}} and {{dclocal_read_repair_chance}} have never been about _enabling_ that mechanism; they are about querying all replicas (even when this is not required by the consistency level) for the sole purpose of maybe read-repairing some of the replicas that wouldn't have been queried otherwise. Which, btw, brings me to reason 1 for considering their removal: their naming/behavior is super confusing. Over the years, I've seen countless users (and not only newbies) misunderstand what those options do, and as a consequence misunderstand when read-repair itself was happening.
> But my 2nd reason for suggesting this is that I suspect {{read_repair_chance}}/{{dclocal_read_repair_chance}} are, especially nowadays, more harmful than anything else when enabled. When those options kick in, what you trade off is additional resource consumption (all nodes have to execute the read) for a _fairly remote chance_ of having some inconsistencies repaired on _some_ replicas _a bit faster_ than they would otherwise be. To justify that last part, let's recall that:
> # most inconsistencies are actually fixed by hints in practice; and in the case where a node stays dead for so long that hints end up timing out, you really should repair the node when it comes back (if not simply re-bootstrap it). Read-repair probably doesn't fix _that_ much stuff in the first place.
> # again, read-repair does happen without those options kicking in. If you do reads at {{QUORUM}}, inconsistencies will eventually get read-repaired all the same. Just a tiny bit less quickly.
> # I suspect almost everyone uses a low "chance" for those options at best (because the extra resource consumption is real), so at the end of the day, it's up to chance how much faster this fixes inconsistencies.
> Overall, I'm having a hard time imagining real cases where that trade-off really makes sense. Don't get me wrong, those options had their place a long time ago when hints weren't working all that well, but I think they bring more confusion than benefits now.
> And I think it's sane to reconsider things every once in a while, and to clean up anything that may not make all that much sense anymore, which I think is the case here.
> Tl;dr, I feel the benefits brought by those options are very slim at best and well overshadowed by the confusion they bring, and not worth maintaining the code that supports them (which, to be fair, isn't huge, but getting rid of {{ReadCallback.AsyncRepairRunner}} wouldn't hurt, for instance).
> Lastly, if the consensus here ends up being that they can have their use in weird cases and that we feel supporting those cases is worth confusing everyone else and maintaining that code, I would still suggest disabling them totally by default.
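
For readers trying to picture the trade-off described in the issue, here is a minimal Python sketch of it. It is a toy model and not Cassandra's actual code path: the function names, the `block_for` parameter (the number of replicas the consistency level requires), and the single global "chance" are all assumptions made for illustration, and the model ignores `dclocal_read_repair_chance` and speculative retries.

```python
import random


def replicas_to_query(live_replicas, block_for, read_repair_chance, rng=random):
    """Pick the replicas a read would contact in this toy model.

    With probability `read_repair_chance` every live replica is queried,
    even though the consistency level only needs `block_for` of them.
    Otherwise only the first `block_for` replicas are queried.
    """
    if rng.random() < read_repair_chance:
        return list(live_replicas)
    return list(live_replicas[:block_for])


def extra_read_fraction(replication_factor, block_for, read_repair_chance):
    """Expected extra replica reads per request, relative to `block_for`.

    Hypothetical helper: it just multiplies the chance by the number of
    replicas that would not otherwise have been queried.
    """
    extra = read_repair_chance * (replication_factor - block_for)
    return extra / block_for


if __name__ == "__main__":
    replicas = ["r1", "r2", "r3"]
    # A chance of 0.0 never widens the read; 1.0 always does.
    print(replicas_to_query(replicas, 2, 0.0))
    print(replicas_to_query(replicas, 2, 1.0))
    # RF=3, QUORUM (2 replicas), chance 0.1: about 5% more replica reads.
    print(extra_read_fraction(3, 2, 0.1))
```

Under these assumptions the extra read load grows linearly with the chance, while a given inconsistency only gets repaired early if the read that happens to hit it is one of the widened ones, which is the "up to chance" point made in the description.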