cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jeff Jirsa"<jji...@apache.org>
Subject Re: LIKE
Date Fri, 23 Jun 2017 22:55:37 GMT


On 2017-06-20 23:44 (-0700), web master <socketman2016@gmail.com> wrote: 
> I have this table
> 
> CREATE TABLE users_by_username (
>     username text PRIMARY KEY,
>     email text,
>     age int
> )
> 
> I want to run query like the following
> 
> select username from users where username LIKE 'shl%' LIMIT 10;
> 
> 
> Always , I want to find only 10 username (Case insensitive) that start with
> specific characters , How can I do it efficient? I want to read minimum
> partitions and best performance
> 

This type of query may be something you can accomplish with SASI (which another user linked
to you), but it will never be efficient. That is a relatively poor use case for Cassandra.


If you MUST do it, you could probably do it by ordering your partitions by username:

CREATE TABLE usernames_by_prefix (root text, prefix text, username text, PRIMARY KEY(root,
prefix, username)

Then 

INSERT INTO usernames_by_prefix(root, prefix, username) VALUES("je", "jef", "jeff");
INSERT INTO usernames_by_prefix(root, prefix, username) VALUES("je", "jer", "jeremy");
INSERT INTO usernames_by_prefix(root, prefix, username) VALUES("br", "bra", "brandon");

Then you can query usernames_by_prefix where root="je" and prefix >= "jef" and get all
of the username starting with "jef" sorted, quickly and efficiently. 

The risk here is that if you have so many usernames that the size of any partition (where
the patition here is each root, like root="je") grows above ~100MB, you'll start having pretty
bad performance until this flaw of cassandra is fixed in 4.0.



---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org


Mime
View raw message