Date: Wed, 13 Sep 2017 22:32:02 +0000 (UTC)
From: "Jeff Jirsa (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Updated] (CASSANDRA-12872) Paging reads and limit reads are missing some data

     [ https://issues.apache.org/jira/browse/CASSANDRA-12872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Jirsa updated CASSANDRA-12872:
-----------------------------------
    Labels: Correctness  (was: )

> Paging reads and limit reads are missing some data
> --------------------------------------------------
>
>                 Key: CASSANDRA-12872
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12872
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Coordination
>            Reporter: Bhaskar Muppana
>            Assignee: Benjamin Lerer
>            Priority: Critical
>              Labels: Correctness
>         Attachments: limiterr-reproduce.sh
>
>
> We are seeing paging and limit reads miss a small number of columns. This happens even on a single-DC cluster when both reads and writes use QUORUM. I have attached a ccm-based script that reproduces the problem.
> * Keyspace RF - 3
> * Table (id int, course text, marks int, primary key(id, course))
> * replicas for partition key 1 - r1, r2 and r3
> * insert (1, '1', 1), (1, '2', 2), (1, '3', 3), (1, '4', 4), (1, '5', 5) - succeeded on all 3 replicas
> * insert (1, '6', 6) succeeded on r1 and r3, failed on r2
> * delete (1, '2'), (1, '3'), (1, '4'), (1, '5') succeeded on r1 and r2, failed on r3
> * insert (1, '7', 7) succeeded on r1 and r2, failed on r3
> The local data on the 3 nodes now looks like this:
> r1: (1, '1', 1), tombstone(2-5 records), (1, '6', 6), (1, '7', 7)
> r2: (1, '1', 1), tombstone(2-5 records), (1, '7', 7)
> r3: (1, '1', 1), (1, '2', 2), (1, '3', 3), (1, '4', 4), (1, '5', 5), (1, '6', 6)
> If we do a paging read with page_size 2 and it gets data from r2 and r3, it returns only (1, '1', 1) and (1, '7', 7), skipping record 6. The same problem occurs when the query does not page but has its limit set to 2 records.
> Read resolution works the same way for paging queries and normal queries. The coordinator shouldn't respond to the client with records/columns for which it did not have complete visibility on all required replicas (in this case 2 replicas). In the case above, it sends record (1, '7', 7) back to the client, but its visibility on r3 is limited to (1, '2', 2), and it relies on r2's data alone to assume that (1, '6', 6) doesn't exist, which is wrong. At the end of the resolution, all it can conclusively say anything about is (1, '1', 1), which exists, and (1, '2', 2), which is deleted.
> Ideally we should have a different resolution implementation for paging/limit queries.
> We could reproduce this on 2.0.17, 2.1.16 and 3.0.9.
> It seems that in 3.0.9 there is a ShortReadProtection transformation on list queries. I assume that is meant to protect against cases like the one above, but we can reproduce the issue on 3.0.9 as well.
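The scenario in the description can be replayed with a small in-memory model. The following is only a sketch under stated assumptions, not Cassandra's actual read path: replicas are plain dictionaries, the timestamps 1 to 4 are made up to order the four write steps, conflicts are resolved by newest timestamp, and a replica is assumed to return cells in clustering order until it has seen `limit` live rows (tombstones are returned but not counted). All function names (`build_replicas`, `replica_read`, `merge`, `naive_limit_read`, `full_read`) are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# A cell is (marks, timestamp); marks is None for a tombstone.
# Timestamps are illustrative assumptions that order the write steps.
Cell = Tuple[Optional[int], int]
Replica = Dict[str, Cell]


def build_replicas() -> Dict[str, Replica]:
    """Apply the writes from the description, including the partial failures."""
    r1: Replica = {}
    r2: Replica = {}
    r3: Replica = {}
    for course in "12345":            # ts 1: inserts 1..5 succeed everywhere
        for r in (r1, r2, r3):
            r[course] = (int(course), 1)
    for r in (r1, r3):                # ts 2: insert 6 fails on r2
        r["6"] = (6, 2)
    for course in "2345":             # ts 3: deletes 2..5 fail on r3
        for r in (r1, r2):
            r[course] = (None, 3)
    for r in (r1, r2):                # ts 4: insert 7 fails on r3
        r["7"] = (7, 4)
    return {"r1": r1, "r2": r2, "r3": r3}


def replica_read(replica: Replica, limit: int) -> Replica:
    """Return cells in clustering order until `limit` live rows were seen."""
    out: Replica = {}
    live = 0
    for course in sorted(replica):
        if live == limit:
            break
        cell = replica[course]
        out[course] = cell
        if cell[0] is not None:
            live += 1
    return out


def merge(responses: List[Replica]) -> Replica:
    """Last-write-wins merge: keep the cell with the newest timestamp."""
    merged: Replica = {}
    for response in responses:
        for course, cell in response.items():
            if course not in merged or cell[1] > merged[course][1]:
                merged[course] = cell
    return merged


def live_rows(merged: Replica, limit: int) -> List[Tuple[str, int]]:
    rows = [(c, cell[0]) for c, cell in sorted(merged.items()) if cell[0] is not None]
    return rows[:limit]


def naive_limit_read(replicas: List[Replica], limit: int) -> List[Tuple[str, int]]:
    """Merge the limited replica responses and trust the result as complete."""
    return live_rows(merge([replica_read(r, limit) for r in replicas]), limit)


def full_read(replicas: List[Replica], limit: int) -> List[Tuple[str, int]]:
    """Reference answer: merge everything the replicas hold, then apply the limit."""
    return live_rows(merge(replicas), limit)


reps = build_replicas()
print(naive_limit_read([reps["r2"], reps["r3"]], 2))  # [('1', 1), ('7', 7)]
print(full_read([reps["r2"], reps["r3"]], 2))         # [('1', 1), ('6', 6)]
```

In this model r3 stops after (1, '2', 2), so the merged view has nothing from r3 beyond that point, and the naive merge treats r2's silence about record 6 as proof that it does not exist. Reading from r1 and r2 instead happens to give the right answer, which is consistent with the report that only some reads miss data.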