Subject: Re: Range deletes, wide partitions, and reverse iterators
From: Nitan Kainth
Date: Tue, 16 May 2017 09:29:41 -0500
To: Hannu Kröger
Cc: Stefano Ortolani, user@cassandra.apache.org
Message-Id: <93D97AF4-758E-46D8-811F-CE3A9BAF7811@bamlabs.com>
In-Reply-To: <35D525BF-765D-4D5D-BFBD-A8A36EEC53BA@gmail.com>

Hannu,

How can you read a partition in reverse?

> On May 16, 2017, at 9:20 AM, Hannu Kröger wrote:
>
> Well, I’m guessing that Cassandra doesn't really know if the range tombstone is useful for this or not.
>
> In many cases it might be that the partition contains data that is within the range of the tombstone but is newer than the tombstone and therefore might still be returned. Scanning through deleted data can be avoided by reading the partition in reverse (if all the deleted data is in the beginning of the partition). Eventually you will still end up reading a lot of tombstones but you will get a lot of live data first and the implicit query limit of 10000 probably is reached before you get to the tombstones. Therefore you will get an immediate answer.
>
> Does it make sense?
>
> Hannu
>
>> On 16 May 2017, at 16:33, Stefano Ortolani wrote:
>>
>> Hi all,
>>
>> I am seeing inconsistencies when mixing range tombstones, wide partitions, and reverse iterators.
>> I still have to understand if the behaviour is to be expected, hence the message on the mailing list.
>>
>> The situation is conceptually simple.
>> I am using a table defined as follows:
>>
>> CREATE TABLE test_cql.test_cf (
>>   hash blob,
>>   timeid timeuuid,
>>   PRIMARY KEY (hash, timeid)
>> ) WITH CLUSTERING ORDER BY (timeid ASC)
>>   AND compaction = {'class' : 'LeveledCompactionStrategy'};
>>
>> I then proceed by loading 2/3GB from 3 sstables which I know contain a really wide partition (> 512 MB) for `hash = x`. I then delete the oldest _half_ of that partition by executing the query below, and restart the node:
>>
>> DELETE
>> FROM test_cql.test_cf
>> WHERE hash = x AND timeid < y;
>>
>> If I keep compactions disabled the following query times out (takes more than 10 seconds to succeed):
>>
>> SELECT *
>> FROM test_cql.test_cf
>> WHERE hash = 0x963204d451de3e611daf5e340c3594acead0eaaf
>> ORDER BY timeid ASC;
>>
>> While the following returns immediately (obviously because no deleted data is ever read):
>>
>> SELECT *
>> FROM test_cql.test_cf
>> WHERE hash = 0x963204d451de3e611daf5e340c3594acead0eaaf
>> ORDER BY timeid DESC;
>>
>> If I force a compaction the problem is gone, but I presume just because the data is rearranged.
>>
>> It seems to me that reading by ASC does not make use of the range tombstone until C* reads the
>> last sstable (which actually contains the range tombstone and is flushed at node restart), and it wastes time reading all rows that are actually not live anymore.
>>
>> Is this expected? Should the range tombstone actually help in these cases?
>>
>> Thanks a lot!
>> Stefano

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org
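
The ASC versus DESC difference discussed in the thread can be pictured with a small toy model. The sketch below is not Cassandra code and says nothing authoritative about the real read path. It simply assumes, as the thread hypothesizes, that the reader has to step over every shadowed row instead of skipping the deleted range. The function name `scan`, the integer clustering keys, and the partition size are all made up for illustration. The limit of 10000 is the figure mentioned in Hannu's reply.

```python
def scan(rows, tombstone_before, limit, reverse=False):
    """Toy single-partition read: return (live_rows, rows_examined).

    rows: clustering keys in ascending order (hypothetical integers,
        standing in for timeid values).
    tombstone_before: keys strictly below this value count as deleted,
        a stand-in for `DELETE ... WHERE timeid < y`.
    limit: stop once this many live rows have been collected.
    """
    live, examined = [], 0
    for key in (reversed(rows) if reverse else rows):
        if len(live) >= limit:
            break
        examined += 1
        # Assumption: the reader cannot skip the deleted range and must
        # look at each shadowed row before discarding it.
        if key >= tombstone_before:
            live.append(key)
    return live, examined


rows = list(range(1_000_000))  # one wide partition
y = 500_000                    # the oldest half is deleted

asc_live, asc_examined = scan(rows, y, limit=10_000)
desc_live, desc_examined = scan(rows, y, limit=10_000, reverse=True)

print("ASC examined: ", asc_examined)
print("DESC examined:", desc_examined)
```

Under these assumptions the ascending read walks through the entire deleted half before it collects its first live row, while the descending read fills the limit from live rows alone and never touches the deleted range.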