incubator-cassandra-user mailing list archives

From Donal Zang <>
Subject Re: [SPAM] Re: slow insertion rate with secondary index
Date Mon, 06 Jun 2011 11:28:10 GMT
On 06/06/2011 10:15, David Boxenhorn wrote:
> Is there really a 10x difference between indexed CFs and non-indexed CFs? 
Well, in my test, it is!
I'm using 0.7.6-2 with 9 nodes, 3 replicas, write_consistency_level QUORUM,
and about 90,000,000 rows (~1K per row).
I use 20 processes, with 20 rows per insertion.
Without an index, the insertion time for the whole row is about 0.02 seconds.
When I then add a secondary index and update every row with the indexed
column, the insertion time is about 2 seconds.
If I remove the index and update the column, the time is about 0.002 seconds.
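
To put these timings in perspective, here is a rough back-of-the-envelope
sketch. It assumes that each timing is for one 20-row insertion call and that
the 20 processes run fully in parallel; neither assumption was measured, and
the function names are only illustrative.

```python
# Rough throughput estimate from the timings quoted above.
# Assumptions (not measured): each timing covers one 20-row insertion call,
# and the 20 processes run fully in parallel without slowing each other down.
PROCESSES = 20
ROWS_PER_BATCH = 20
TOTAL_ROWS = 90_000_000


def rows_per_second(batch_seconds, processes=PROCESSES, rows_per_batch=ROWS_PER_BATCH):
    """Aggregate rows written per second across all processes."""
    return processes * rows_per_batch / batch_seconds


def hours_for_all_rows(batch_seconds, total_rows=TOTAL_ROWS):
    """Hours needed to write every row once at the given per-call time."""
    return total_rows / rows_per_second(batch_seconds) / 3600.0


if __name__ == "__main__":
    timings = {
        "insert, no index": 0.02,
        "update, with index": 2.0,
        "update, index removed": 0.002,
    }
    for label, seconds in timings.items():
        print("%-22s %10.0f rows/s %8.2f hours for all rows"
              % (label, rows_per_second(seconds), hours_for_all_rows(seconds)))
```

Under those assumptions, the indexed update would take on the order of days
for the full data set, while the other two cases finish in hours or less.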

Another thing I noticed: if you first do the insertion, then build
the secondary index using "update column family ...", and then do a select
based on the index, the result is not right (it seems the index is still
being built even though the "update" command returns quickly). And after a
while, get_indexed_slices() times out from time to time (with
pycassa.ConnectionPool('keyspace1', ['host1','host2'], timeout=600,
pool_size=1)).
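
For anyone trying to reproduce the intermittent timeouts, a minimal sketch of
a retry-and-timing wrapper might help. It is generic Python, not part of
pycassa: call_with_retries and its parameters are hypothetical names, and the
exception type to retry on is an assumption you would replace with whatever
your client version actually raises.

```python
import time


def call_with_retries(query, attempts=3, delay_seconds=1.0, retry_on=(TimeoutError,)):
    """Run query() up to `attempts` times, retrying on the given exceptions.

    Returns (result, attempt_number, seconds_taken_by_successful_call).
    Re-raises the last error if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        started = time.time()
        try:
            result = query()
            return result, attempt, time.time() - started
        except retry_on as error:
            last_error = error
            # Pause before the next try, but not after the final failure.
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise last_error
```

The query argument is any zero-argument callable, for example a lambda that
wraps the get_indexed_slices() call and turns its result into a list, so the
attempt count and elapsed time can be logged for each indexed query.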

Has anyone else had similar experiences using secondary indexes?

Donal Zang
Computing Center, IHEP
19B YuquanLu, Shijingshan District,Beijing, 100049
86 010 8823 6018
