incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: various Cassandra performance problems when CQL3 is really used
Date Wed, 18 Dec 2013 03:53:09 GMT
> * select id from table where token(id) > token(some_value) and secondary_index = other_val limit 2 allow filtering;
> 
> Filtering absolutely kills the performance. On a table populated with 130,000 records, on a
> single-node Cassandra server (my i7 notebook, 2GB of JVM heap), with a secondary index built
> on a column whose value set has low cardinality, this query takes 156 seconds to finish.
Yes, this is why you have to add ALLOW FILTERING: you are asking the nodes to read all the
data that matches and then filter it in memory, which is an SQL-style operation.

Your example query is somewhat complex and I doubt it could get decent performance. What does
the query plan look like?

IMHO you need to do further de-normalisation; you will get the best performance when you select
rows by their full or partial primary key.
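
Something like this is what I mean (the table and column names here are made up, not your
schema): keep a second table partitioned by the value you are currently filtering on, so the
read becomes a plain primary key lookup instead of an indexed, filtered range scan:

    -- query-specific table, partitioned by the value the secondary index covered
    CREATE TABLE table_by_other_val (
        other_val text,
        id        uuid,
        PRIMARY KEY (other_val, id)
    );

    -- the read is now a single-partition slice, no ALLOW FILTERING needed
    SELECT id FROM table_by_other_val WHERE other_val = 'some_value' LIMIT 2;

    -- paging continues from the last id seen instead of using token()
    SELECT id FROM table_by_other_val
    WHERE other_val = 'some_value' AND id > 123e4567-e89b-12d3-a456-426614174000 LIMIT 2;

The write path has to insert into this table as well as the base table, which is the trade-off
you make for the fast read.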

 
> By the way, the performance is an order of magnitude better if this patch is applied:
That looks like it's tuned to your specific need; it would ignore the max results included
in the query.

> * select id from table;
> 
> As we saw in the trace log, the query - although it queries just row ids - scans all
> columns of all the rows and (probably) compares TTL with current time (?) (we saw hundreds
> of thousands of gettimeofday(2)). This means that if the table somehow mixes wide and narrow
> rows, the performance suffers horribly.
Selecting all rows from a table requires a range scan, which reads all rows from all nodes. It
should never be used in production.

Not sure what you mean by “scans all columns from all rows”. A select by column name will
use a SliceByNamesReadCommand, which will only read the required columns from each SSTable
(it normally short-circuits and reads from fewer).

If there is a TTL, the ExpiringColumn.localExpirationTime must be checked; if there is no TTL
it will not be checked.
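
For example (hypothetical table, not your schema): a write with USING TTL is what creates the
ExpiringColumns that need that check, and a select that names its columns is the case that can
use the by-name read described above:

    CREATE TABLE users (
        id    uuid PRIMARY KEY,
        name  text,
        email text
    );

    -- columns written with a TTL become ExpiringColumns, so reads have to
    -- compare localExpirationTime against the current time
    INSERT INTO users (id, name, email)
    VALUES (123e4567-e89b-12d3-a456-426614174000, 'alice', 'alice@example.com')
    USING TTL 86400;

    -- naming the columns lets the read fetch only those columns by name
    SELECT name, email FROM users
    WHERE id = 123e4567-e89b-12d3-a456-426614174000;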

> As Cassandra checks all the columns in selects, performance suffers badly if the collection
> is of any interesting size.
This is not true; could you provide an example where you think this is happening?

> Additionally, we saw various random irreproducible freezes, high CPU consumption when
> nothing happens (even with trace log level set no activity was reported) and highly unpredictable
> performance characteristics after nodetool flush and/or major compaction.
What was the HW platform and what was the load?
Typically freezes in the server correlate to JVM GC; the JVM GC can also be using the CPU.

If you have wide rows or make large reads you may run into more JVM GC issues. 

nodetool flush will (as it says) flush all the tables to disk. If you have a lot of tables and/or
a lot of secondary indexes this can cause the switch lock to be held, preventing write threads
from progressing. Once flush threads stop waiting on the flush queue the lock will be released.
See the help for memtable_flush_queue_size in the yaml file.

Major compaction is not recommended in production. If you are seeing it cause performance
problems I would guess it is related to JVM GC and/or disk IO not being able to keep up.
When used it creates a single SSTable for each table, which will not be compacted again until
(by default) 3 other large SSTables are created or you run major compaction again. For this
reason it is not recommended.

> Conclusions:
> 
> - do not use collections
> - do not use secondary indexes
> - do not use filtering
> - have your rows as narrow as possible if you need any kind of all row keys traversal
These features all have a use, but it looks like you leaned on them heavily while creating
a relational model. Especially the filtering: you have to explicitly enable it precisely to stop
the client from sending queries that will take a long time.
The only time row key traversal is normally used is when reading data through Hadoop. You should
always strive to read row(s) from a table by the full or partial primary key.
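
For example, with a made-up compound primary key, the access patterns look like this (the last
one is the Hadoop-style token range traversal; the token values assume the Murmur3 partitioner):

    CREATE TABLE events_by_day (
        day        text,
        event_time timeuuid,
        payload    text,
        PRIMARY KEY (day, event_time)
    );

    -- full primary key: one row
    SELECT payload FROM events_by_day
    WHERE day = '2013-12-18' AND event_time = 50554d6e-29bb-11e5-b345-feff819cdc9f;

    -- partial (partition) key: one partition, sliced by the clustering column
    SELECT event_time, payload FROM events_by_day WHERE day = '2013-12-18' LIMIT 100;

    -- Hadoop-style traversal reads the whole table one token range at a time
    SELECT day, event_time FROM events_by_day
    WHERE token(day) > -9223372036854775808 AND token(day) <= -4611686018427387904;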

> With these conclusions in mind, CQL seems redundant, plain old thrift may be used, joins
> should be done client side and/or all indexes need to be handled manually. Correct?
No.
CQL provides a set of functionality not present in the thrift API.
Joins and indexes should generally be handled by denormalising the data during writes.
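
A rough sketch of what that looks like (the tables here are hypothetical): write the same data
to the base table and to a query-specific table, so the "join" and the "index" are both paid
for on the write path rather than at read time:

    CREATE TABLE items (
        id       uuid PRIMARY KEY,
        category text,
        name     text
    );

    CREATE TABLE items_by_category (
        category text,
        id       uuid,
        name     text,
        PRIMARY KEY (category, id)
    );

    -- write to both tables up front; a logged batch keeps them in step
    BEGIN BATCH
        INSERT INTO items (id, category, name)
        VALUES (123e4567-e89b-12d3-a456-426614174000, 'books', 'dune');
        INSERT INTO items_by_category (category, id, name)
        VALUES ('books', 123e4567-e89b-12d3-a456-426614174000, 'dune');
    APPLY BATCH;

    -- the read that would have been a join or index lookup is a single partition read
    SELECT id, name FROM items_by_category WHERE category = 'books';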

It sounds like your data model was too relational; you need to denormalise and read rows by
primary key. Secondary indexes are useful when you have a query pattern that is used infrequently.


Hope that helps. 


-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 18/12/2013, at 3:47 am, Ondřej Černoš <cernoso@gmail.com> wrote:

> Hi all,
> 
> we are reimplementing a legacy interface of an inventory-like service (currently built
> on top of MySQL) on Cassandra and I thought I would share some findings with the list. The
> interface semantics are given and cannot be changed. We chose Cassandra due to its multiple
> datacenter capabilities and no-SPOF qualities. The dataset is small (6 tables having 150,000
> records, a bunch of tables with up to thousands of records), but the model is not trivial
> - the MySQL model has some 20+ tables, joins are frequent, m:n relationships are frequent
> and the like. The interface is read heavy. We thought the size of the dataset should allow
> the whole dataset to fit into the memory of each node (3 node cluster in each DC, 3 replicas,
> local quorum operations) and that even though some operations (like secondary index lookup)
> are not superfast, due to the size of the dataset it should perform ok. We were wrong.
> 
> We use CQL3 exclusively and we use all of its capabilities (collections, secondary indexes,
> filtering), because they make the data model maintainable. We denormalised what had to be
> denormalised in order to avoid client side joins. A usual query to the storage means one CQL
> query on a denormalised table. We need to support integer offset/limit paging, filter-by-example
> kind of queries, M:N relationship queries and all the usual suspects of an old SQL-backed interface.
> 
> This is the list of operations we have identified so far that perform really poorly. The row
> id is called id in the following:
> 
> * select id from table where token(id) > token(some_value) and secondary_index = other_val limit 2 allow filtering;
> 
> Filtering absolutely kills the performance. On a table populated with 130,000 records, on a
> single-node Cassandra server (my i7 notebook, 2GB of JVM heap), with a secondary index built
> on a column whose value set has low cardinality, this query takes 156 seconds to finish.
> 
> By the way, the performance is an order of magnitude better if this patch is applied:
> 
> diff --git a/src/java/org/apache/cassandra/db/index/composites/CompositesSearcher.java b/src/java/org/apache/cassandra/db/index/composites/CompositesSearcher.java
> index 5ab1df6..13af671 100644
> --- a/src/java/org/apache/cassandra/db/index/composites/CompositesSearcher.java
> +++ b/src/java/org/apache/cassandra/db/index/composites/CompositesSearcher.java
> @@ -190,7 +190,8 @@ public class CompositesSearcher extends SecondaryIndexSearcher
> 
>             private int meanColumns = Math.max(index.getIndexCfs().getMeanColumns(), 1);
>             // We shouldn't fetch only 1 row as this provides buggy paging in case the first row doesn't satisfy all clauses
> -            private final int rowsPerQuery = Math.max(Math.min(filter.maxRows(), filter.maxColumns() / meanColumns), 2);
> +//            private final int rowsPerQuery = Math.max(Math.min(filter.maxRows(), filter.maxColumns() / meanColumns), 2);
> +            private final int rowsPerQuery = 100000;
> 
>             public boolean needsFiltering()
>             {
> 
> * select id from table;
> 
> As we saw in the trace log, the query - although it queries just row ids - scans all
> columns of all the rows and (probably) compares TTL with current time (?) (we saw hundreds
> of thousands of gettimeofday(2)). This means that if the table somehow mixes wide and narrow
> rows, the performance suffers horribly.
> 
> * CQL collections
> 
> See the point above with mixing wide rows and narrow rows. As Cassandra checks all the
> columns in selects, performance suffers badly if the collection is of any interesting size.
> 
> Additionally, we saw various random irreproducible freezes, high CPU consumption when
> nothing happens (even with trace log level set no activity was reported) and highly unpredictable
> performance characteristics after nodetool flush and/or major compaction.
> 
> Conclusions:
> 
> - do not use collections
> - do not use secondary indexes
> - do not use filtering
> - have your rows as narrow as possible if you need any kind of all row keys traversal
> 
> With these conclusions in mind, CQL seems redundant, plain old thrift may be used, joins
> should be done client side and/or all indexes need to be handled manually. Correct?
> 
> Thanks for reading,
> 
> ondrej cernos

