Date: Mon, 4 Apr 2016 13:43:25 +0000 (UTC)
From: "Aleksey Yeschenko (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-11432) Counter values become under-counted when running repair.

    [ https://issues.apache.org/jira/browse/CASSANDRA-11432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224139#comment-15224139 ]

Aleksey Yeschenko commented on CASSANDRA-11432:
-----------------------------------------------

[~dikanggu] As a matter of fact, yes, yes you can (:

1. Is your cluster a fresh 2.2 one? More specifically, does it by any chance have counters generated by 2.0 or older?
2. How large is "larger than 1%"?
3. Can you observe the same thing without repair running?
4. Have you observed any timeouts? What do you do in case of a timeout? Ignore or retry? Counter updates are not idempotent, so if you retry a timed-out increment, you have a real risk of overcounting (in case the update made it, but the client timed out). If you ignore it instead, then a missed increment would undercount. Another case that would cause an undercount is a retried decrement, of course.
5. What's your commit log policy? If sync, what's the sync period? Have you observed any node failures during the experiment that would cause any commit log loss?

I've had another look at the code, and nothing popped out at me, really. It's gotta be either timeouts (maybe you time out more often under repair load?), or crashed nodes and subsequent commit log loss. Or, of course, I really am missing something esoteric.

> Counter values become under-counted when running repair.
> --------------------------------------------------------
>
>                 Key: CASSANDRA-11432
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11432
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Dikang Gu
>            Assignee: Aleksey Yeschenko
>
> We are experimenting with Counters in Cassandra 2.2.5. Our setup is that we have 6 nodes across three different regions, and in each region the replication factor is 2. Basically, each node holds a full copy of the data.
> We are writing to the cluster with CL = 2, and reading with CL = 1.
> We are doing 30k/s counter increments/decrements per node, and at the same time we are double-writing to our mysql tier, so that we can measure the accuracy of the C* counters compared to mysql.
> The experiment result was great at the beginning: the counter values in C* and mysql were very close, with a difference of less than 0.1%.
> But when we start to run repair on one node, the counter values in C* become much lower than the values in mysql, and the difference becomes larger than 1%.
> My question is: is it a known problem that counter values become under-counted while repair is running? Should we avoid running repair for counter tables?
> Thanks.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
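
The timeout reasoning in point 4 of the comment can be illustrated with a small sketch. This is a toy in-memory model, not Cassandra code: the names `Attempt`, `apply_write` and `drift` are hypothetical, and the model assumes that a retry always lands and that the client never learns whether a timed-out write was applied.

```python
from typing import Iterable, Tuple

# (delta, applied, acked): the delta the client sent, whether the replica
# actually applied it, and whether the client saw the ack before timing out.
Attempt = Tuple[int, bool, bool]


def apply_write(value: int, delta: int, applied: bool) -> int:
    """Return the stored counter value after one write attempt."""
    return value + delta if applied else value


def drift(policy: str, attempts: Iterable[Attempt]) -> int:
    """Return stored value minus intended value for a timeout policy.

    policy is "retry" (resend every timed-out update once) or
    "ignore" (drop every timed-out update).
    """
    stored = 0
    intended = 0
    for delta, applied, acked in attempts:
        intended += delta
        stored = apply_write(stored, delta, applied)
        if not acked and policy == "retry":
            # Assumption of this sketch: the retry always succeeds.
            stored = apply_write(stored, delta, True)
    return stored - intended


# One clean increment, one applied-but-timed-out, one lost-and-timed-out.
increments = [(1, True, True), (1, True, False), (1, False, False)]
# One clean increment followed by an applied-but-timed-out decrement.
decrement = [(1, True, True), (-1, True, False)]
```

Under these assumptions, `drift("retry", increments)` is positive (the applied-but-timed-out increment is counted twice), `drift("ignore", increments)` is negative (the lost increment is never resent), and `drift("retry", decrement)` is negative (the decrement is applied twice), matching the three cases described in the comment.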