incubator-cassandra-user mailing list archives

From Anand Somani <>
Subject Re: Secondary indexes for multi-value fields
Date Wed, 22 Dec 2010 16:17:34 GMT
One approach is to ask yourself how you would use this information, for
example:

   - How often do you go from user->tags?
   - How often would you want to go from tag->users?
   - What kind of reporting would you want to do on tags, and how often?
   - Can multiple people add the same tag to the same user, and if so, are
   those tags maintained separately?
   - Given your business, how many users do you expect?
   - etc.

Depending on the answers, one approach might work better than the other. I
have not used indexes or non-id-based searches in Cassandra yet (I do not have
that use case), so this is based only on time I have spent reading about them.

One approach, using indexes, was given by Jool; the other approach is to use
a reverse index:

   - 2 CFs: one for users and one for tags (the reverse index)
      - User: might need an SC holding the tags and some information, such
      as who tagged it
      - Tag: maps each tag to columns of users
   - Advantages:
      - 1 query to find user->tags on the user CF
      - tag->users is answered on the tag CF (I would think this would be
      more efficient than user->tags since that will potentially hit multiple
      rows/nodes, unless I have misunderstood secondary indexes)
   - Disadvantages:
      - Need to write to a couple of CFs, but writes are cheaper than reads
      in Cassandra
      - Since you update 2 CFs and there are no transactions, one write might
      succeed and the other might fail
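
As a rough illustration of the two-CF layout above, here is a minimal Python
sketch that models each CF as an in-memory dict of rows. It is not Cassandra
client code: the names user_cf, tag_cf, add_tag, tags_for_user and
users_for_tag are hypothetical, and the SC on the user side is simplified to
a plain tag -> "who tagged it" mapping.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the two column families.
# Each maps a row key to a dict of {column name: value}.
user_cf = defaultdict(dict)  # user id -> {tag: who tagged it}
tag_cf = defaultdict(dict)   # tag -> {user id: who tagged it} (reverse index)


def add_tag(user_id, tag, tagged_by):
    """Write the tag to both CFs. Nothing makes the two writes atomic."""
    user_cf[user_id][tag] = tagged_by
    tag_cf[tag][user_id] = tagged_by


def tags_for_user(user_id):
    """user->tags: a single row read on the user CF."""
    return sorted(user_cf.get(user_id, {}))


def users_for_tag(tag):
    """tag->users: a single row read on the tag CF."""
    return sorted(tag_cf.get(tag, {}))


add_tag("alice", "admin", tagged_by="root")
add_tag("alice", "beta", tagged_by="bob")
add_tag("carol", "beta", tagged_by="bob")

print(tags_for_user("alice"))  # ['admin', 'beta']
print(users_for_tag("beta"))   # ['alice', 'carol']
```

If the second assignment in add_tag failed in a real cluster, the user row
would list a tag that the tag row does not know about, which is the
disadvantage noted above.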

Even with the other suggestion of using indexes, you can still add the
tag->users CF.

On Wed, Dec 22, 2010 at 4:54 AM, Prasad Sunkari <> wrote:

> Hi all,
> I have a column family for users of my system and I need to have tags set
> to these users.  My current plan is to have a column that holds a string
> (comma separated tags).
> I am not clear if this is the best way to do it, especially because this
> may lead to complications when more than one administrator is trying to tag
> the same user (lost updates) as well as with the secondary indexes (if I
> wanted to use the built-in secondary indexes).  I also am not sure if it is
> possible to have a secondary index on a multi-valued column!
> Another alternative is to have it in a super column with each tag being a
> column by itself and let my application take care of the secondary indexes.
> I am currently of the opinion that the second solution is the only thing
> that I could do.
> Any suggestions?  Since this is my first app on Cassandra I am trying to
> see if my opinion is correct.
> Thanks,
> Prasad
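
The lost-update concern in the quoted message can be shown with a small
Python sketch. It uses plain dicts as a hypothetical stand-in for one user
row and makes no Cassandra calls; the tag names and variable names are made
up for illustration.

```python
# Hypothetical in-memory row for one user, with all tags in one column.
row_csv = {"tags": "vip"}

# Two administrators read the same value, then each writes back a whole
# new comma-separated string.
seen_by_admin_a = row_csv["tags"]
seen_by_admin_b = row_csv["tags"]
row_csv["tags"] = seen_by_admin_a + ",beta"
row_csv["tags"] = seen_by_admin_b + ",trial"  # overwrites admin A's write
print(row_csv["tags"])  # vip,trial

# One column per tag: each administrator writes only the column being added,
# so neither write replaces the other.
row_columns = {"vip": ""}
row_columns["beta"] = ""   # admin A
row_columns["trial"] = ""  # admin B
print(sorted(row_columns))  # ['beta', 'trial', 'vip']
```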
