cassandra-user mailing list archives

From aaron morton <>
Subject Re: Index search in provided list of rows (list of rowKeys).
Date Wed, 14 Sep 2011 21:23:32 GMT
The way to specify more restrictions on the query is to add them to the index_clause. The
index clause is applied to the set of all rows in the database, not a subset; applying it
to a subset would implicitly support a sub query. Currently it's doing "select then project";
this would be "select then select then project".
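
To make the distinction concrete, here is a rough in-memory sketch in Python. It is a toy model only: the `rows` data and the helpers `select`, `project` and `restrict_to_keys` are hypothetical names made up for illustration and are not part of the Cassandra API.

```python
# Toy in-memory model of the two query shapes. All names and data here are
# hypothetical illustrations, not Cassandra's API.
rows = {
    "k1": {"state": "UT", "age": 30, "name": "a"},
    "k2": {"state": "UT", "age": 45, "name": "b"},
    "k3": {"state": "CA", "age": 30, "name": "c"},
}


def select(source, clause):
    """Keep rows whose columns match every (column, value) pair in clause."""
    return {
        key: cols
        for key, cols in source.items()
        if all(cols.get(col) == value for col, value in clause)
    }


def project(source, columns):
    """Keep only the requested columns of each row."""
    return {
        key: {col: cols[col] for col in columns if col in cols}
        for key, cols in source.items()
    }


def restrict_to_keys(source, keys):
    """The extra 'select': narrow the row set to the given row keys first."""
    return {key: source[key] for key in keys if key in source}


# Today: "select then project" over all rows.
current = project(select(rows, [("state", "UT")]), ["name"])

# The request: "select then select then project", i.e. a sub query over a key list.
proposed = project(
    select(restrict_to_keys(rows, ["k2", "k3"]), [("state", "UT")]),
    ["name"],
)

# What works today: put the extra restriction into the clause itself.
combined = project(select(rows, [("state", "UT"), ("age", 30)]), ["name"])
```

In this model `current` holds k1 and k2, `proposed` holds only k2, and `combined` holds only k1. The last form is the one that maps onto adding expressions to the index_clause.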

Right now I would use Solandra, or do the entire search in Sphinx and get the row keys for
the result documents. In the future you may be able to use this


Aaron Morton
Freelance Cassandra Developer

On 15/09/2011, at 12:46 AM, Evgeniy Ryabitskiy wrote:

> Why is it radical?
> It would be the same get_indexed_slices search, but within a specified set of rows. So mostly it
> would be one more search expression, over row IDs rather than only column values. Usually, the more
> restrictions you can specify in a search query, the faster the search can be (or at least not slower).
> About moving to another engine:
> Sphinx has its advantages (quite fast) and disadvantages (painful integration, lots
> of limitations). My company currently uses it in production, so moving to another search
> engine is a big step, and it will be considered.
> What I want to discuss is the common task of searching in Cassandra. Maybe I am missing some
> already well-known solution for it (a silver bullet)?
> I see only 2 solutions:
> 1) Use an external search engine that indexes all stored fields
>    advantages:
>      supports full-text search
>      some engines have nice search features like "sorting by relevance"
>    disadvantages:
>      for range scans it stores column values, which means that a huge part of the Cassandra data
>      will also be stored in the search engine's metadata
>      engines usually have a set of limitations
> 2) Use Cassandra's built-in index search
>    advantages:
>      no need to index all the columns that are used for filtering
>      filtering is performed at the storage layer, close to the data
>    disadvantages:
>      no full-text search support
>      requires creating and maintaining secondary indexes
> The two solutions are mutually exclusive: you can choose only one, and there is no way to combine
> them (except intersection on the client side, which is not a solution).
> So the API that was discussed would open up the possibility of using that combination.
> For me it looks like a third solution. Could it really change the way we search
> in Cassandra?
> Evgeny.
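
For reference, the client-side intersection mentioned in the quoted message can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name and the sample key lists are hypothetical, and it assumes both result sets have already been fetched in full, which is exactly the cost that makes it unattractive for large results.

```python
def intersect_preserving_order(ranked_keys, indexed_keys):
    """Keep the search engine's relevance order, dropping keys the index query did not return.

    ranked_keys:  row keys from the external search engine, best match first (assumed input).
    indexed_keys: row keys returned by the secondary index query (assumed input).
    """
    allowed = set(indexed_keys)  # set membership keeps the filter linear in the input size
    return [key for key in ranked_keys if key in allowed]


# Hypothetical sample data standing in for the two result sets.
from_search_engine = ["doc7", "doc2", "doc9", "doc4"]
from_index_query = ["doc2", "doc4", "doc5"]

matches = intersect_preserving_order(from_search_engine, from_index_query)
```

With the sample data above, `matches` is `["doc2", "doc4"]`.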
