incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: slow insertion rate with secondary index
Date Mon, 06 Jun 2011 03:38:38 GMT
Index updates require a read-before-write (to find out what the prior
version was, if any, and update the index accordingly).  That read is
random i/o.

Index creation on the other hand is a lot of sequential i/o, hence
more efficient.

So, the classic bulk load advice to ingest data prior to creating
indexes applies.
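
To make the difference concrete, here is a toy in-memory sketch in Python.
It is not Cassandra's implementation; the class and method names
(ToyIndexedTable, insert_with_index, build_index) are made up for
illustration. It only shows why maintaining the index on each write needs
a lookup of the prior value, while building it afterwards is a single
pass over the data.

```python
class ToyIndexedTable:
    """Toy model of a table with one indexed column (illustrative only)."""

    def __init__(self):
        self.rows = {}    # row key -> value of the indexed column
        self.index = {}   # column value -> set of row keys
        self.reads = 0    # how many times a prior value had to be looked up

    def insert_with_index(self, key, value):
        # Read-before-write: look up the prior value so the stale
        # index entry can be removed before adding the new one.
        self.reads += 1
        old = self.rows.get(key)
        if old is not None and old != value:
            self.index[old].discard(key)
            if not self.index[old]:
                del self.index[old]
        self.rows[key] = value
        self.index.setdefault(value, set()).add(key)

    def insert_without_index(self, key, value):
        # Plain write: no lookup of the prior value is needed.
        self.rows[key] = value

    def build_index(self):
        # Bulk index creation: one pass over the data already stored.
        self.index = {}
        for key, value in self.rows.items():
            self.index.setdefault(value, set()).add(key)


data = [("k1", "x"), ("k2", "y"), ("k1", "y"), ("k3", "x")]

live = ToyIndexedTable()
bulk = ToyIndexedTable()
for key, value in data:
    live.insert_with_index(key, value)
    bulk.insert_without_index(key, value)
bulk.build_index()
```

Both tables end up with the same index, but the first one did a lookup on
every insert. On disk, each of those lookups is a potential seek.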

On Sun, Jun 5, 2011 at 5:47 PM, Donal Zang <zangds@ihep.ac.cn> wrote:
> I did an insertion test with and without secondary indexes, and found that:
> Without secondary index: ~10864 rows inserted per second
> With secondary index on one column (BytesType): ~1515 rows inserted per
> second
> Is this normal? Why would a secondary index have so much effect?
>
> I noticed that if I build the index using “update column family ...” after I
> inserted all data (90578207 rows), it finishes very quickly.
> I'm not very clear about how the secondary index works. Could someone
> explain this?
> Thanks!
> Donal
>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
