cassandra-commits mailing list archives

From "Pavel Yaskevich (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-11067) Improve SASI syntax
Date Thu, 04 Feb 2016 01:46:39 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131534#comment-15131534 ]

Pavel Yaskevich edited comment on CASSANDRA-11067 at 2/4/16 1:46 AM:
---------------------------------------------------------------------

If you don't add an analyzer that does stemming and tokenization to the column, it would work
exactly how you describe: "distributing" would return 0 results and the whole string would
return 1. It's the tokenization feature which makes it work the way it does in the example,
because after tokenization every term in that string is a separate entity. Even more so in the
case of "distributing": only its stem, "distribut", is going to be saved, which is why matching
"distributing" against "distribution" (the original value) is going to produce results. But to
make it work that way, multiple additional SASI options are needed; by default it's not going
to do any of that and is going to behave like you describe.
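
To make the difference concrete, here is a minimal Python sketch of the two behaviours. It is a
toy model, not SASI code: the suffix list, the helper names (toy_stem, index_terms, matches) and
the stored value are all hypothetical, and it assumes the query term goes through the same
analysis as the indexed value.

```python
import re

# Hypothetical suffix list for illustration only; a real stemmer is far more thorough.
SUFFIXES = ("ing", "ion")


def toy_stem(term):
    """Strip one known suffix, keeping at least three leading characters."""
    for suffix in SUFFIXES:
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def index_terms(value, analyzed):
    """Return the set of terms that would be stored for a column value."""
    if not analyzed:
        # No analyzer: the whole string is a single term.
        return {value}
    # Analyzer on: lowercase, split into tokens, keep only each token's stem.
    tokens = re.findall(r"[a-z0-9]+", value.lower())
    return {toy_stem(token) for token in tokens}


def matches(query, value, analyzed):
    """True if every term derived from the query is present among the stored terms."""
    query_terms = index_terms(query, analyzed)
    return bool(query_terms) and query_terms <= index_terms(value, analyzed)


# Hypothetical stored value containing the word "distribution".
value = "Apache Cassandra distribution"

print(matches("distributing", value, analyzed=False))  # default: no match
print(matches(value, value, analyzed=False))           # default: whole string matches
print(matches("distributing", value, analyzed=True))   # stemmed tokens: match
```

In this sketch both "distributing" and "distribution" reduce to "distribut", which is the only
reason the analyzed lookup succeeds; without the analyzer only the exact whole string matches.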


was (Author: xedin):
If you don't add an analyzer to the column which does stemming and tokenization it would work
exactly how you describe - "distributing" would return 0 results and whole string would be
1, it's tokenization feature which makes it work the way it does in the example because after
tokenization every term in the of that string is a separate entity and even more in case of
"distributing" only it's stem is going to be saved which is "distribut" that's why matching
"distributing" vs. "distribution" which is an original value is going to produce results,
but to make it work there are multiple additional SASI options needed, by default it's not
going to do any of that. 

> Improve SASI syntax
> -------------------
>
>                 Key: CASSANDRA-11067
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11067
>             Project: Cassandra
>          Issue Type: Task
>          Components: CQL
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>             Fix For: 3.4
>
>
> I think everyone agrees that a LIKE operator would be ideal, but that's probably not
> in scope for an initial 3.4 release.
> Still, I'm uncomfortable with the initial approach of overloading = to mean "satisfies
> index expression."  The problem is that it will be very difficult to back out of this behavior
> once people are using it.
> I propose adding a new operator in the interim instead.  Call it MATCHES, maybe.  With
> the exact same behavior that SASI currently exposes, just with a separate operator rather
> than being rolled into =.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
