cassandra-commits mailing list archives

From "DOAN DuyHai (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10661) Integrate SASI to Cassandra
Date Fri, 22 Jan 2016 20:58:40 GMT


DOAN DuyHai commented on CASSANDRA-10661:

Hello [~xedin], it's me again.

 I've had some discussions with search people and they told me that wildcard searches (name
like "\*xxxxx\*") are very expensive. Classical data structures like suffix trees are suited
to prefix searching (name like "xxx\*"). For suffix search (name like "\*xxx") they create
a *reversed* index. Does it mean that the CONTAINS mode (formerly named SUFFIX) is more expensive
than the NORMAL search mode? If yes, how much more expensive is it (x2? an order of magnitude?)
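
 To make the cost difference concrete, here is a minimal Python sketch of the reversed-index idea the search people described. It is only an illustration under my own assumptions, not how SASI is implemented: the function names are hypothetical and a sorted list stands in for the real on-disk structure. A trailing-wildcard pattern ("xxx\*") is a range scan over sorted terms, a leading-wildcard pattern ("\*xxx") becomes the same range scan over reversed terms, and a naive "\*xxx\*" lookup has to inspect every term.

```python
import bisect

def build_index(terms, reverse=False):
    """Sorted (key, original) pairs; keys are reversed for the reversed index."""
    return sorted((t[::-1] if reverse else t, t) for t in terms)

def range_scan(index, key_prefix):
    """Return originals whose key starts with key_prefix, via binary search."""
    start = bisect.bisect_left(index, (key_prefix,))
    matches = []
    for key, original in index[start:]:
        if not key.startswith(key_prefix):
            break  # keys are sorted, so no later key can match
        matches.append(original)
    return matches

def match_trailing_wildcard(index, fixed):
    """Pattern 'fixed*': one range scan over the forward index."""
    return range_scan(index, fixed)

def match_leading_wildcard(reversed_index, fixed):
    """Pattern '*fixed': one range scan over the reversed index."""
    return range_scan(reversed_index, fixed[::-1])

def match_both_wildcards(terms, fixed):
    """Pattern '*fixed*' done naively: every term has to be inspected."""
    return [t for t in terms if fixed in t]

names = ["alice", "alicia", "bob", "malice"]
forward = build_index(names)
backward = build_index(names, reverse=True)

print(match_trailing_wildcard(forward, "ali"))
print(match_leading_wildcard(backward, "ice"))
print(match_both_wildcards(names, "lic"))
```

 The first two lookups touch only the matching slice of the index, while the last one is linear in the number of terms unless the index also stores every suffix of every term, which is where the extra cost would come from.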

 Second question, more related to the implementation: since you query the nodes following the
token range and do not hit all nodes like a normal secondary index, does it imply that *sorting*
(ORDER BY) is no longer relevant, since you do not retrieve all possible results? (I've seen
in QueryPlan.MAX_ROWS that there is a hard-coded limit of 10 000 results.)
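
 The concern can be sketched as follows. This is a toy Python model under my own assumptions (a hypothetical scan_token_ranges helper, integers standing in for rows, a cap of 3 instead of 10 000), not a description of what SASI actually does: if matches are collected range by range and the scan stops at a row cap, sorting the capped set does not give the same answer as sorting all matches.

```python
def scan_token_ranges(ranges, predicate, max_rows):
    """Collect matching rows range by range, stopping once max_rows is reached."""
    results = []
    for rows in ranges:
        for row in rows:
            if predicate(row):
                results.append(row)
                if len(results) == max_rows:
                    return results
    return results

# Three token ranges, each holding a few rows (hypothetical data).
ranges = [[5, 9, 7], [1, 8], [2, 6]]
matches_everything = lambda row: True

capped = scan_token_ranges(ranges, matches_everything, max_rows=3)
all_rows = scan_token_ranges(ranges, matches_everything, max_rows=10**6)

sorted_after_cap = sorted(capped)          # ordering applied to the truncated set
true_top_three = sorted(all_rows)[:3]      # ordering applied to every match

print(sorted_after_cap)
print(true_top_three)
```

 In this model the two lists differ, because rows in token ranges that were never reached cannot take part in the ordering.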

 Sorry to annoy you with my questions, but they are important so that we, evangelists, can
give users the right use cases and especially deter them from misusing SASI when it's
not appropriate or when the search cost is prohibitive.

> Integrate SASI to Cassandra
> ---------------------------
>                 Key: CASSANDRA-10661
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths
>            Reporter: Pavel Yaskevich
>            Assignee: Pavel Yaskevich
>              Labels: sasi
>             Fix For: 3.x
> We have recently released a new secondary index engine built using the SecondaryIndex API.
> There are still a couple of things to work out regarding 3.x, since it is currently targeted
> at the 2.0 release. I want to make this an umbrella issue for all of the things related to the
> integration of SASI, which are also tracked in [sasi_issues|], into the mainline Cassandra 3.x
> release.

This message was sent by Atlassian JIRA
