cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jan Chochol (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6586) Cassandra touches all columns on CQL3 select
Date Thu, 16 Jan 2014 08:07:22 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873142#comment-13873142
] 

Jan Chochol commented on CASSANDRA-6586:
----------------------------------------

{quote}
this ticket does not imply that all physical thrift columns (cells) of a physical thrift row
(partition) are read when you do a CQL select on a CQL primary key
{quote}
I want imply exactly this:
Every CQL select (regardless on CQL column names specified after {{SELECT}} and regardless
on {{WHERE}} conditions) reads whole CQL row (of returned rows).
That means, in case that primary key consists of only one part (partition key), than every
SQL select reads whole CQL partition (Thrift row) (of returned CQL rows), as in this case
CQL rows are equal to CQL partitions.

> Cassandra touches all columns on CQL3 select
> --------------------------------------------
>
>                 Key: CASSANDRA-6586
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6586
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jan Chochol
>            Priority: Minor
>
> It seems that Cassandra is checking (garbage collecting) all columns of all returned
rows, despite the fact that not all columns are requested.
> Example:
> * use following script to fill Cassandra with test data:
> {noformat}
> perl -e "print(\"DROP KEYSPACE t;\nCREATE KEYSPACE t WITH replication = {'class': 'SimpleStrategy',
'replication_factor' : 1};\nuse t;\nCREATE TABLE t (a varchar PRIMARY KEY, b varchar, c varchar,
d varchar);\nCREATE INDEX t_b ON t (b);\nCREATE INDEX t_c ON t (c);\nCREATE INDEX t_d ON t
(d);\n\");\$max = 200; for(\$i = 0; \$i < \$max; \$i++) { \$j = int(\$i * 10 / \$max);
\$k = int(\$i * 100 / \$max); print(\"INSERT INTO t (a, b, c, d) VALUES ('a\$i', 'b\$j', 'c\$k',
'd\$i');\n\")}\n" | cqlsh
> {noformat}
> * turn on {{ALL}} logging for Cassandra
> * issue this query:
> {noformat}
> select a from t where c = 'c1';
> {noformat}
> This is result:
> {noformat}
> [root@jch3-devel:~/c4] cqlsh --no-color
> Connected to C4 Cluster Single at localhost:9160.
> [cqlsh 3.1.7 | Cassandra 1.2.11-SNAPSHOT | CQL spec 3.0.0 | Thrift protocol 19.36.1]
> Use HELP for help.
> cqlsh> use t;
> cqlsh:t> select a from t where c = 'c1';
>  a
> ----
>  a3
>  a2
> {noformat}
> From Cassandra log:
> {noformat}
> 2014-01-15 09:14:56.663+0100 [Thrift:1] [TRACE] QueryProcessor.java(125) org.apache.cassandra.cql3.QueryProcessor:
component=c4 Process org.apache.cassandra.cql3.statements.SelectStatement@614b3189 @CL.ONE
> 2014-01-15 09:14:56.810+0100 [Thrift:1] [TRACE] ReadCallback.java(67) org.apache.cassandra.service.ReadCallback:
component=c4 Blockfor is 1; setting up requests to /127.0.0.1
> 2014-01-15 09:14:56.816+0100 [ReadStage:2] [DEBUG] CompositesSearcher.java(112) org.apache.cassandra.db.index.composites.CompositesSearcher:
component=c4 Most-selective indexed predicate is 't.c EQ c1'
> 2014-01-15 09:14:56.817+0100 [ReadStage:2] [TRACE] ColumnFamilyStore.java(1493) org.apache.cassandra.db.ColumnFamilyStore:
component=c4 Filtering org.apache.cassandra.db.index.composites.CompositesSearcher$1@e15911
for rows matching org.apache.cassandra.db.filter.ExtendedFilter$FilterWithCompositeClauses@4a9e6b8a
> 2014-01-15 09:14:56.817+0100 [ReadStage:2] [TRACE] CompositesSearcher.java(237) org.apache.cassandra.db.index.composites.CompositesSearcher:
component=c4 Scanning index 't.c EQ c1' starting with 
> 2014-01-15 09:14:56.820+0100 [ReadStage:2] [TRACE] SSTableReader.java(776) org.apache.cassandra.io.sstable.SSTableReader:
component=c4 Adding cache entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t.t_c-ic-1,
6331) -> org.apache.cassandra.db.RowIndexEntry@66a6574b
> 2014-01-15 09:14:56.821+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter:
component=c4 collecting 0 of 10000: 6133:false:0@1389773577394000
> 2014-01-15 09:14:56.821+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter:
component=c4 collecting 1 of 10000: 6132:false:0@1389773577391000
> 2014-01-15 09:14:56.822+0100 [ReadStage:2] [TRACE] CompositesSearcher.java(313) org.apache.cassandra.db.index.composites.CompositesSearcher:
component=c4 Adding index hit to current row for 6133
> 2014-01-15 09:14:56.825+0100 [ReadStage:2] [TRACE] SSTableReader.java(776) org.apache.cassandra.io.sstable.SSTableReader:
component=c4 Adding cache entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t-ic-1, 6133)
-> org.apache.cassandra.db.RowIndexEntry@32ad3193
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter:
component=c4 collecting 0 of 2147483647: :false:0@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter:
component=c4 collecting 1 of 2147483647: b:false:2@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter:
component=c4 collecting 1 of 2147483647: c:false:2@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter:
component=c4 collecting 1 of 2147483647: d:false:2@1389773577394000
> 2014-01-15 09:14:56.828+0100 [ReadStage:2] [TRACE] CompositesSearcher.java(313) org.apache.cassandra.db.index.composites.CompositesSearcher:
component=c4 Adding index hit to current row for 6132
> 2014-01-15 09:14:56.828+0100 [ReadStage:2] [TRACE] SSTableReader.java(776) org.apache.cassandra.io.sstable.SSTableReader:
component=c4 Adding cache entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t-ic-1, 6132)
-> org.apache.cassandra.db.RowIndexEntry@87d66d5
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter:
component=c4 collecting 0 of 2147483647: :false:0@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter:
component=c4 collecting 1 of 2147483647: b:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter:
component=c4 collecting 1 of 2147483647: c:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter:
component=c4 collecting 1 of 2147483647: d:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] CompositesSearcher.java(232) org.apache.cassandra.db.index.composites.CompositesSearcher:
component=c4 Read only 2 (< 10000) last page through, must be done
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] CompositesSearcher.java(232) org.apache.cassandra.db.index.composites.CompositesSearcher:
component=c4 Read only 2 (< 10000) last page through, must be done
> 2014-01-15 09:14:56.830+0100 [Thrift:1] [DEBUG] Tracing.java(169) org.apache.cassandra.tracing.Tracing:
component=c4 request complete
> {noformat}
> Note that Cassandra is checking all columns ({{a}}, {{b}}, {{c}} and {{d}}), even we
requested only column {{a}}.
> Things became really nasty, when using lots of columns, or bigger collections (yes -
each member of collection is checked).
> This is quite counter intuitive behaviour, as all Cassandra guides said, that using wide
rows should not affect performance, but in CQL3 they create big performance bottleneck.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Mime
View raw message