incubator-cassandra-user mailing list archives

From Jason Tang <ares.t...@gmail.com>
Subject Re: Cassandra search performance
Date Wed, 25 Apr 2012 14:45:33 GMT
1.0.8

On 25 April 2012 at 10:38 PM, Philip Shon <philip.shon@gmail.com> wrote:

> What version of Cassandra are you using? I found a big performance hit
> when querying on the secondary index.
>
> I came across this bug in versions prior to 1.1
>
> https://issues.apache.org/jira/browse/CASSANDRA-3545
>
> Hope that helps.
>
> 2012/4/25 Jason Tang <ares.tang@gmail.com>
>
>> I also found that if the only search condition is "status", it scans only
>> 200 records.
>>
>> But if I add a second condition on "partition", it scans all records,
>> because the "partition" condition matches all of them.
>>
>> Yet when "status" is combined with another condition such as "userName",
>> even though "userName" is the same in all 1,000,000 records, it scans only
>> 200 records.
>>
>> So the result depends on the scan execution plan. If we have several
>> search conditions, how does it work? Does Cassandra have something like an
>> execution plan?
>>
>>
>> On 25 April 2012 at 9:18 PM, Jason Tang <ares.tang@gmail.com> wrote:
>>
>>> Hi,
>>>
>>>    We have the CF shown below and use a secondary index to search on a
>>> simple column, "status". Among 1,000,000 rows, 200 have the status we
>>> want.
>>>
>>>   But when we search, performance is very poor. Checking with the
>>> command "./bin/nodetool -h localhost -p 8199 cfstats", Cassandra
>>> read 1,000,000 records with a "Read Latency" of 0.2 ms, so in total it
>>> took 200 seconds.
>>>
>>>   It uses a lot of CPU, and checking the stack, all threads in Cassandra
>>> are reading from the socket.
>>>
>>>   So I wonder how to really use the index to find the 200 records instead
>>> of scanning all rows. (Super Column?)
>>>
>>> ColumnFamily: queue
>>>       Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>>>       Default column value validator: org.apache.cassandra.db.marshal.BytesType
>>>       Columns sorted by: org.apache.cassandra.db.marshal.BytesType
>>>       Row cache size / save period in seconds / keys to save : 0.0/0/all
>>>       Row Cache Provider: org.apache.cassandra.cache.ConcurrentLinkedHashCacheProvider
>>>       Key cache size / save period in seconds: 0.0/0
>>>       GC grace seconds: 0
>>>       Compaction min/max thresholds: 4/32
>>>       Read repair chance: 0.0
>>>       Replicate on write: false
>>>       Bloom Filter FP chance: default
>>>       Built indexes: [queue.idxStatus]
>>>       Column Metadata:
>>>         Column Name: status (737461747573)
>>>           Validation Class: org.apache.cassandra.db.marshal.AsciiType
>>>           Index Name: idxStatus
>>>           Index Type: KEYS
>>>
>>> BRs
>>>  //Jason
>>>
>>
>>
>
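
The scan counts reported in the quoted messages (200 rows with "status"
alone, all rows once "partition" is added) are what one would see if a single
indexed condition drives the lookup and the remaining conditions are applied
as filters on the rows it returns. The Python sketch below is a toy model of
that idea only. It is not Cassandra's query planner, and the helper names, the
column values ("ready", "done", "p0", "alice") and the reduced row count are
all hypothetical. It also reproduces the 1,000,000 x 0.2 ms = 200 s estimate
from the original post.

```python
def build_rows(total, wanted):
    """Create `total` rows keyed 0..total-1.

    The first `wanted` rows get status 'ready' and the rest 'done'; every row
    shares the same partition and userName (hypothetical values).
    """
    return {
        key: {
            "status": "ready" if key < wanted else "done",
            "partition": "p0",
            "userName": "alice",
        }
        for key in range(total)
    }


def build_index(rows, column):
    """Map each value of `column` to the list of row keys that hold it."""
    index = {}
    for key, row in rows.items():
        index.setdefault(row[column], []).append(key)
    return index


def search(rows, indexes, driver, conditions):
    """Use the index on `driver` to pick candidates, then filter each one.

    Returns (matching_keys, rows_read), where rows_read counts every
    candidate row that had to be fetched, whether or not it matched.
    """
    candidates = indexes[driver].get(conditions[driver], [])
    matches = []
    rows_read = 0
    for key in candidates:
        rows_read += 1
        row = rows[key]
        if all(row[col] == val for col, val in conditions.items()):
            matches.append(key)
    return matches, rows_read


def estimated_seconds(rows_read, latency_ms=0.2):
    """Total time if each row read costs `latency_ms` (0.2 ms in the post)."""
    return rows_read * latency_ms / 1000.0


if __name__ == "__main__":
    # Scaled down from the 1,000,000 rows in the post to keep the demo light.
    rows = build_rows(total=10_000, wanted=200)
    indexes = {col: build_index(rows, col) for col in ("status", "partition")}
    query = {"status": "ready", "partition": "p0"}

    for driver in ("status", "partition"):
        matches, rows_read = search(rows, indexes, driver, query)
        print(driver, "drives:", len(matches), "matches,", rows_read, "rows read")

    # Cost figures using the numbers from the original post.
    print("200 rows:", estimated_seconds(200), "s")
    print("1,000,000 rows:", estimated_seconds(1_000_000), "s")
```

Both choices of driver return the same matches. Only the number of rows read
differs, which is why the choice of driving condition matters so much when one
condition matches every row.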
