cassandra-commits mailing list archives

From "Jack Krupansky (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-10661) Integrate SASI to Cassandra
Date Sat, 23 Jan 2016 16:56:40 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113836#comment-15113836 ]

Jack Krupansky edited comment on CASSANDRA-10661 at 1/23/16 4:55 PM:
---------------------------------------------------------------------

Is there also a way to query a SASI-indexed column by exact value? It seems as if enabling
prefix or contains means the index will always be queried by prefix or contains. For example,
sometimes I want to query for a full first name, say where the first name really is "J", and
not get "John" and "James" as well, while at other times I am indeed looking for names starting
with a prefix of "Jo" to find "John", "Joseph", etc.

Or can I in fact have two indexes on a single column, one a traditional exact match and one
a prefix match? Hmmm... in which case, which one gets used if I just specify a column name?

CREATE INDEX first_name_full ON mytable (first_name)...
CREATE CUSTOM INDEX first_name_prefix ON mytable (first_name)...

It would be good to have an example that illustrates this. In fact, I would argue that first
and last names are perfect examples of where you really do need to query on both exact match
and partial match. In fact, I'm not sure I can think of any examples of non-tokenized text
fields where you don't want to reserve the ability to find an exact match even if you do need
partial matches for some queries.
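
A rough sketch of the kind of example being asked for, assuming the CQL syntax that SASI
eventually shipped with in Cassandra 3.4. The index class name, the 'mode' option, and the
LIKE operator are taken from that later release, not from this thread, and the mytable
schema is hypothetical:

```cql
-- Hypothetical table, for illustration only.
CREATE TABLE mytable (
    id uuid PRIMARY KEY,
    first_name text
);

-- A single SASI index in PREFIX mode (syntax assumed from Cassandra 3.4+).
CREATE CUSTOM INDEX first_name_prefix ON mytable (first_name)
    USING 'org.apache.cassandra.index.sasi.SASIIndex'
    WITH OPTIONS = {'mode': 'PREFIX'};

-- Exact match: intended to return only rows whose first_name is exactly 'J'.
SELECT * FROM mytable WHERE first_name = 'J';

-- Prefix match: intended to return 'John', 'Joseph', etc.
SELECT * FROM mytable WHERE first_name LIKE 'Jo%';
```

Under that assumed syntax, the operator in the query, rather than a second index, would
select exact versus prefix matching. Whether the version under review behaves the same way
is exactly the open question.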

Will SPARSE mode in fact give me an exact match? (Sounds like it.) In which case, would I
be better off with a SPARSE index for first_name_full, or would a traditional Cassandra non-custom
index work fine (or even better)?

Are there any use cases of traditional Cassandra indexes that shouldn't almost automatically
be converted to SPARSE? After all, the current recommended best practice is to avoid secondary
indexes where the column cardinality is either very high or very low, which seems to be a
match for SPARSE, although the precise meaning of SPARSE is still a bit fuzzy to me.


> Integrate SASI to Cassandra
> ---------------------------
>
>                 Key: CASSANDRA-10661
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10661
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>              Labels: sasi
>             Fix For: 3.x
>
>
> We have recently released a new secondary index engine (https://github.com/xedin/sasi)
built using the SecondaryIndex API. There are still a couple of things to work out regarding 3.x,
since it currently targets the 2.0 release. I want to make this an umbrella issue for all
of the things related to the integration of SASI, which are also tracked in [sasi_issues|https://github.com/xedin/sasi/issues],
into the mainline Cassandra 3.x release.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
