incubator-cassandra-user mailing list archives

From Brian O'Neill <boneil...@gmail.com>
Subject Re: How to manually build and maintain secondary indexes
Date Thu, 26 Jul 2012 18:13:45 GMT
Alon,

We came to the same conclusion regarding secondary indexes, and instead of
using them we implemented our own wide-row indexing capability and
open-sourced it.  

It's available here:
https://github.com/hmsonline/cassandra-indexing

We still have challenges rebuilding indexes, etc.  It doesn't address all
of your concerns, but I tried to capture the motivation behind our
implementation here:
http://brianoneill.blogspot.com/2012/03/cassandra-indexing-good-bad-and-ugly.html

-brian

-- 
Brian O'Neill
Lead Architect, Software Development
Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406
p: 215.588.6024
www.healthmarketscience.com





On 7/26/12 2:05 PM, "Alon Pilberg" <alon.p@taboola.com> wrote:

>Hello,
>My company is working on transitioning our relational data model to
>Cassandra. Naturally, one of the basic demands is to have secondary
>indexes to answer queries quickly according to the application's
>needs.
>After looking at Cassandra's native support for secondary indexes, we
>decided not to use them due to the poor performance for
>high-cardinality values. Instead, we decided to implement secondary
>indexes manually.
>Some searching led us to
>http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html which
>details a schema for such indexes. However, the method employed there
>specifically adds an index entries column family, whereas it seems
>like only 2 CFs are needed - one for the items and one for the indexes
>(assuming one has access to both old and new values when updating an
>item). The article actually mentioned that this is indeed not the
>obvious solution, "for a number of reasons related to Cassandra's
>model of eventual consistency ... will not reliably work" and "it's a
>really good idea to make sure you understand why this CF is
>necessary". However, no additional information is provided on what
>might be a critical issue, as dealing with corrupt indexes in a large
>production environment is sure to be a nightmare.
>What are the community's thoughts on this matter? Given the writer's
>credentials in the Cassandra realm, specifically regarding indexes,
>I'm inclined not to ignore his remarks.
>References to a document / system that implement similar indexes would
>be greatly appreciated as well.
>
>- alon
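
The two-CF scheme described in the quoted message (one CF for items, one wide-row CF for the index, with the old value read before each update) can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: plain dicts stand in for column families, and the names items, index, read_value, write_value and stale_entries are hypothetical, not taken from cassandra-indexing or from the linked article. It shows one plausible way such an index can drift (two writers reading the same old value before either writes), which may or may not be the exact issue the article has in mind.

```python
# Toy in-memory model; plain dicts stand in for column families (assumption).
items = {}   # item key -> {column: value}
index = {}   # indexed value -> set of item keys (one wide row per value)


def read_value(item_key, column):
    """Return the currently stored value, or None if the item is absent."""
    return items.get(item_key, {}).get(column)


def write_value(item_key, column, new_value, old_value_seen):
    """Two-CF update: drop the index entry for the value this writer read
    earlier, then write the item and add the new index entry."""
    if old_value_seen is not None:
        index.get(old_value_seen, set()).discard(item_key)
    items.setdefault(item_key, {})[column] = new_value
    index.setdefault(new_value, set()).add(item_key)


def stale_entries(column):
    """List (value, key) index entries whose item no longer holds that value."""
    return sorted(
        (value, key)
        for value, keys in index.items()
        for key in keys
        if items[key][column] != value
    )


# Sequential updates: each writer sees the latest value, the index stays clean.
write_value("user1", "city", "Paris", read_value("user1", "city"))
write_value("user1", "city", "Rome", read_value("user1", "city"))
sequential_stale = stale_entries("city")

# Interleaved updates: both writers read "Rome" before either one writes,
# so neither of them removes the index entry the other is about to create.
seen_by_a = read_value("user1", "city")
seen_by_b = read_value("user1", "city")
write_value("user1", "city", "Oslo", seen_by_a)
write_value("user1", "city", "Lima", seen_by_b)
concurrent_stale = stale_entries("city")

print(sequential_stale)
print(concurrent_stale)
```

In the interleaved case the item ends up with "Lima" while the index row for "Oslo" still points at it. A query on the index alone would return a false match unless the reader re-checks the item or some extra bookkeeping records which index entries were written for each item.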


