incubator-cassandra-user mailing list archives

From August Zajonc <augu...@augustz.com>
Subject Design Pattern - Tag Cloud / Inverted Index
Date Sun, 27 Dec 2009 08:52:28 GMT
Just getting up to speed with Cassandra and the terminology and
concepts. Lots of fun. Apologies for a beginner question.

Focusing on how the data will be used is very helpful in coming up
with a model.

I have a question about a design pattern I expect is relatively
common. It's the tag pattern (or sets or inverted index...)

Item A
- Tag A
- Tag B
- Tag C

Tag A
- Item A
- Item B
- Item C

I want fast access using the item as a key and fast access when using
the tag as a key (inverted index). This lets me answer two questions
very quickly: which tags have been applied to this item, and which
items belong to this tag.

Looking at the data model, a simple solution is two column families,
one containing items as the row-key with tags as columns, and a second
with tags as the row-key with items as columns. This gives me fast
access at the cost of 2x the writes (cheap) and storage (also cheap).
So not bad.
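
A minimal sketch of that two-column-family layout, using plain Python
dicts as stand-ins for the column families. No Cassandra client is
involved, and the names (TagIndex, item_tags, tag_items, add_tag) are
hypothetical, chosen only for illustration:

```python
from collections import defaultdict


class TagIndex:
    """In-memory stand-in for two column families (illustrative only)."""

    def __init__(self):
        # "Column family" 1: row key = item, column names = tags.
        self.item_tags = defaultdict(dict)
        # "Column family" 2: row key = tag, column names = items.
        self.tag_items = defaultdict(dict)

    def add_tag(self, item, tag, timestamp):
        # Every logical write becomes two physical writes (the 2x cost).
        self.item_tags[item][tag] = timestamp
        self.tag_items[tag][item] = timestamp

    def tags_for(self, item):
        # Single row lookup keyed by item.
        return sorted(self.item_tags.get(item, {}))

    def items_for(self, tag):
        # Single row lookup keyed by tag (the inverted index).
        return sorted(self.tag_items.get(tag, {}))


index = TagIndex()
index.add_tag("Item A", "Tag A", timestamp=1)
index.add_tag("Item A", "Tag B", timestamp=2)
index.add_tag("Item B", "Tag A", timestamp=3)

print(index.tags_for("Item A"))
print(index.items_for("Tag A"))
```

Both questions are answered by reading a single row; the price is that
every tagging operation writes to both structures.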

Curious what the standard pattern is here.

One other issue is consistency. Since my writes are idempotent,
consistency between the two could be maintained with a write log (or
similar) replayed on recovery. It feels like this has been done often
enough that someone will have some good pointers.
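
A rough sketch of the replay idea, again with plain Python dicts and
sets standing in for the two column families. The log format and the
function names (apply_write, replay) are assumptions made for
illustration, not an existing Cassandra facility:

```python
def apply_write(item_tags, tag_items, item, tag):
    """Idempotent write: applying the same (item, tag) twice changes nothing."""
    item_tags.setdefault(item, set()).add(tag)
    tag_items.setdefault(tag, set()).add(item)


def replay(write_log, item_tags, tag_items):
    """Re-apply every logged write; safe to repeat because writes are idempotent."""
    for item, tag in write_log:
        apply_write(item_tags, tag_items, item, tag)


# The log records the intent of each write before it is applied.
write_log = [("Item A", "Tag A"), ("Item B", "Tag A")]

# Simulated crash: the second write reached item_tags but not tag_items.
item_tags = {"Item A": {"Tag A"}, "Item B": {"Tag A"}}
tag_items = {"Tag A": {"Item A"}}

# On recovery, replaying the whole log brings both sides back in agreement.
replay(write_log, item_tags, tag_items)
print(sorted(tag_items["Tag A"]))
```

Because each write only sets a column that may already be present,
replaying entries that had fully succeeded is harmless.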

Many cheers,

- August
