incubator-cassandra-user mailing list archives

From Edward Capriolo <>
Subject Re: How to create data model from RDBMS ERD
Date Thu, 23 Jun 2011 16:55:09 GMT
On Thu, Jun 23, 2011 at 12:43 PM, mcasandra <> wrote:

> How should one go about creating a data model from an RDBMS ERD for the
> Big Table data model? For example, an RDBMS has many indexes required for
> queries, and I think this is the most important aspect when designing the
> data model in Big Table.
> I was initially planning to denormalize into one CF and use secondary
> indexes. However, I also read that creating secondary indexes has a
> performance impact. So the other option is to create an inverted index.
> But it also seems to be bad to have too many CFs. We have requirements to
> support high volume, a minimum of 500 writes + 500 reads per sec.
> What would you advise?

From a high-level perspective, secondary indexes are identical to creating
your own inverted index. Indexes store data in a different ordering to
accelerate certain types of searches. Every extra index requires extra disk
space, and each insertion requires updating more structures.
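
As a rough illustration of that trade-off, here is a minimal in-memory
sketch in Python. It is not Cassandra code, and the names users_by_id,
users_by_city, insert_user and find_by_city are made up for this example.
The primary store is keyed by id; the index keeps the same rows ordered by
another column, so every insert has to touch both structures.

```python
import bisect

# Primary store: row key -> row (hypothetical layout, for illustration only).
users_by_id = {}
# Index: (city, user_id) pairs kept sorted, so a lookup by city is a range scan.
users_by_city = []


def insert_user(user_id, name, city):
    """Write the row, then update the index: two structures per insert."""
    users_by_id[user_id] = {"name": name, "city": city}
    bisect.insort(users_by_city, (city, user_id))


def find_by_city(city):
    """Binary-search the sorted index instead of scanning every row."""
    # (city,) sorts before every (city, user_id) pair with the same city.
    start = bisect.bisect_left(users_by_city, (city,))
    ids = []
    for indexed_city, user_id in users_by_city[start:]:
        if indexed_city != city:
            break
        ids.append(user_id)
    return ids
```

Each additional indexed column would mean another sorted structure and
another update inside insert_user, which is where the extra disk space and
write cost come from.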

In the RDBMS world, indexes are usually placed on foreign keys or on columns
that appear in the WHERE clause. You do not want unnecessary indexes because
they slow the write path.

A good quote is "design your data around your access requirements". Thus if
you need to be able to quickly search the data on a certain column, you need
the index. There is no way around that; if it hurts performance, you need
more hardware.

If secondary indexes in Cassandra do what you need, you should use them;
that will save you design time. If they do not do exactly what you need, you
may have to create your own inverted index, which involves a second Column
Family.
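
A hand-rolled inverted index of that kind might look roughly like the
sketch below. It models the two column families as plain Python dicts
(row key -> {column name: value}) rather than using any Cassandra client,
and the names users, users_by_email, put_user and get_users_by_email are
hypothetical. The index CF uses the indexed value as its row key and the
data CF's row keys as its column names.

```python
# Two "column families" modeled as dicts: row key -> {column name: value}.
users = {}           # data CF, keyed by user id
users_by_email = {}  # index CF, keyed by email; column names are user ids


def put_user(user_id, email, name):
    """Write the data row and keep the index CF in step with it."""
    old = users.get(user_id)
    if old is not None and old["email"] != email:
        # The indexed value changed, so remove the stale index column.
        stale = users_by_email.get(old["email"], {})
        stale.pop(user_id, None)
        if not stale:
            users_by_email.pop(old["email"], None)
    users[user_id] = {"email": email, "name": name}
    # The column value is unused; the column name carries the row key.
    users_by_email.setdefault(email, {})[user_id] = ""


def get_users_by_email(email):
    """Read the index row, then fetch each referenced data row."""
    ids = sorted(users_by_email.get(email, {}))
    return [users[user_id] for user_id in ids]
```

The application, not the database, is responsible for the index here, which
is the extra design time mentioned above. The sketch also ignores that in a
distributed store the data write and the index write are separate
operations that can fail independently.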
