cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexander Dejanovski <a...@thelastpickle.com>
Subject Re: Cassandra 3.10 has partial partition key search but does it result in a table scan?
Date Tue, 09 May 2017 14:32:57 GMT
Hi Kant,

Unless you provide the full partition key, I see no way for Cassandra to
avoid doing a full table scan.
In order to know on which specific nodes to search (and in which sstables
,etc...) it needs to have a token. The token is a hash of the whole
partition key.
For a specific value of column "a" and different values of column "b" you
always end up with different tokens that have no guaranty to be stored on
the same node.
After that, bloom filters, partition indexes, etc... require the full token
too, so a full scan is further necessary on each node to get the data.

TL;DR : no way to avoid a full cluster scan unless you provide the full
partition key in your where clause.

Cheers,

On Tue, May 9, 2017 at 4:24 PM Jon Haddad <jonathan.haddad@gmail.com> wrote:

> Nope, I didn’t comment on that query.   I specifically answered your
> question about "select * from hello where a='foo' allow filtering;”
>
> The query you’ve listed here looks like it would also do a full table scan
> (again, I don’t see how it would be avoided).
>
> I recommend firing up a 3 node cluster using CCM, creating a key space
> with RF=1, and seeing what it does.
>
> On May 9, 2017, at 9:12 AM, Kant Kodali <kant@peernova.com> wrote:
>
> Hi,
>
> Are you saying The following query select max(b) from hello where a='a1'
> allow filtering; doesn't result in a table scan? I got the result for
> this query and yes I just tried tracing it and looks like it is indeed
> doing a table scan on ReadStage-2 although I am not sure if I am
> interpreting it right? Finally is there anyway to prevent table scan while
> providing the partial partition key and get the max b ?
>
> <Screen Shot 2017-05-09 at 7.07.46 AM.png>
> ​
>
>
> On Tue, May 9, 2017 at 6:33 AM, Jon Haddad <jonathan.haddad@gmail.com>
> wrote:
>
>> I don’t see any way it wouldn’t.  Have you tried tracing it?
>>
>> > On May 9, 2017, at 8:32 AM, Kant Kodali <kant@peernova.com> wrote:
>> >
>> > Hi All,
>> >
>> > It looks like Cassandra 3.10 has partial partition key search but does
>> it result in a table scan? for example I can have the following
>> >
>> > create table hello(
>> > a text,
>> > b int,
>> > c text,
>> > d text,
>> > primary key((a,b), c)
>> > );
>> >
>> > Now I can do select * from hello where a='foo' allow filtering;// This
>> works in 3.10 but I wonder if this query results in table scan and if so is
>> there any way to limit such that I get max b?
>> >
>> > Thanks!
>>
>>
> --
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

Mime
View raw message