Return-Path:
X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 86495108A0 for ; Fri, 14 Mar 2014 04:08:52 +0000 (UTC)
Received: (qmail 97211 invoked by uid 500); 14 Mar 2014 04:08:49 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 97175 invoked by uid 500); 14 Mar 2014 04:08:49 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 97106 invoked by uid 99); 14 Mar 2014 04:08:46 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 14 Mar 2014 04:08:46 +0000
Date: Fri, 14 Mar 2014 04:08:46 +0000 (UTC)
From: "Jeremiah Jordan (JIRA)"
To: commits@cassandra.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Resolved] (CASSANDRA-6586) Cassandra touches all columns on CQL3 select
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

     [ https://issues.apache.org/jira/browse/CASSANDRA-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremiah Jordan resolved CASSANDRA-6586.
----------------------------------------
    Resolution: Duplicate

> Cassandra touches all columns on CQL3 select
> --------------------------------------------
>
>                 Key: CASSANDRA-6586
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6586
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jan Chochol
>            Priority: Minor
>
> It seems that Cassandra is checking (garbage collecting) all columns of all returned rows, despite the fact that not all columns are requested.
> Example:
> * use the following script to fill Cassandra with test data:
> {noformat}
> perl -e "print(\"DROP KEYSPACE t;\nCREATE KEYSPACE t WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 1};\nuse t;\nCREATE TABLE t (a varchar PRIMARY KEY, b varchar, c varchar, d varchar);\nCREATE INDEX t_b ON t (b);\nCREATE INDEX t_c ON t (c);\nCREATE INDEX t_d ON t (d);\n\");\$max = 200; for(\$i = 0; \$i < \$max; \$i++) { \$j = int(\$i * 10 / \$max); \$k = int(\$i * 100 / \$max); print(\"INSERT INTO t (a, b, c, d) VALUES ('a\$i', 'b\$j', 'c\$k', 'd\$i');\n\")}\n" | cqlsh
> {noformat}
> * turn on {{ALL}} logging for Cassandra
> * issue this query:
> {noformat}
> select a from t where c = 'c1';
> {noformat}
> This is the result:
> {noformat}
> [root@jch3-devel:~/c4] cqlsh --no-color
> Connected to C4 Cluster Single at localhost:9160.
> [cqlsh 3.1.7 | Cassandra 1.2.11-SNAPSHOT | CQL spec 3.0.0 | Thrift protocol 19.36.1]
> Use HELP for help.
> cqlsh> use t;
> cqlsh:t> select a from t where c = 'c1';
>  a
> ----
>  a3
>  a2
> {noformat}
> From the Cassandra log:
> {noformat}
> 2014-01-15 09:14:56.663+0100 [Thrift:1] [TRACE] QueryProcessor.java(125) org.apache.cassandra.cql3.QueryProcessor: component=c4 Process org.apache.cassandra.cql3.statements.SelectStatement@614b3189 @CL.ONE
> 2014-01-15 09:14:56.810+0100 [Thrift:1] [TRACE] ReadCallback.java(67) org.apache.cassandra.service.ReadCallback: component=c4 Blockfor is 1; setting up requests to /127.0.0.1
> 2014-01-15 09:14:56.816+0100 [ReadStage:2] [DEBUG] CompositesSearcher.java(112) org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 Most-selective indexed predicate is 't.c EQ c1'
> 2014-01-15 09:14:56.817+0100 [ReadStage:2] [TRACE] ColumnFamilyStore.java(1493) org.apache.cassandra.db.ColumnFamilyStore: component=c4 Filtering org.apache.cassandra.db.index.composites.CompositesSearcher$1@e15911 for rows matching org.apache.cassandra.db.filter.ExtendedFilter$FilterWithCompositeClauses@4a9e6b8a
> 2014-01-15 09:14:56.817+0100 [ReadStage:2] [TRACE] CompositesSearcher.java(237) org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 Scanning index 't.c EQ c1' starting with
> 2014-01-15 09:14:56.820+0100 [ReadStage:2] [TRACE] SSTableReader.java(776) org.apache.cassandra.io.sstable.SSTableReader: component=c4 Adding cache entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t.t_c-ic-1, 6331) -> org.apache.cassandra.db.RowIndexEntry@66a6574b
> 2014-01-15 09:14:56.821+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 0 of 10000: 6133:false:0@1389773577394000
> 2014-01-15 09:14:56.821+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 10000: 6132:false:0@1389773577391000
> 2014-01-15 09:14:56.822+0100 [ReadStage:2] [TRACE] CompositesSearcher.java(313) org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 Adding index hit to current row for 6133
> 2014-01-15 09:14:56.825+0100 [ReadStage:2] [TRACE] SSTableReader.java(776) org.apache.cassandra.io.sstable.SSTableReader: component=c4 Adding cache entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t-ic-1, 6133) -> org.apache.cassandra.db.RowIndexEntry@32ad3193
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 0 of 2147483647: :false:0@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 2147483647: b:false:2@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 2147483647: c:false:2@1389773577394000
> 2014-01-15 09:14:56.826+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 2147483647: d:false:2@1389773577394000
> 2014-01-15 09:14:56.828+0100 [ReadStage:2] [TRACE] CompositesSearcher.java(313) org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 Adding index hit to current row for 6132
> 2014-01-15 09:14:56.828+0100 [ReadStage:2] [TRACE] SSTableReader.java(776) org.apache.cassandra.io.sstable.SSTableReader: component=c4 Adding cache entry for KeyCacheKey(/mnt/ebs/cassandra/data/t/t/t-t-ic-1, 6132) -> org.apache.cassandra.db.RowIndexEntry@87d66d5
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 0 of 2147483647: :false:0@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 2147483647: b:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 2147483647: c:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] SliceQueryFilter.java(164) org.apache.cassandra.db.filter.SliceQueryFilter: component=c4 collecting 1 of 2147483647: d:false:2@1389773577391000
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] CompositesSearcher.java(232) org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 Read only 2 (< 10000) last page through, must be done
> 2014-01-15 09:14:56.829+0100 [ReadStage:2] [TRACE] CompositesSearcher.java(232) org.apache.cassandra.db.index.composites.CompositesSearcher: component=c4 Read only 2 (< 10000) last page through, must be done
> 2014-01-15 09:14:56.830+0100 [Thrift:1] [DEBUG] Tracing.java(169) org.apache.cassandra.tracing.Tracing: component=c4 request complete
> {noformat}
> Note that Cassandra is checking all columns ({{a}}, {{b}},
{{c}} and {{d}}), even though we requested only column {{a}}.
> Things become really nasty when using lots of columns or bigger collections (yes, each member of a collection is checked).
> This is quite counterintuitive behaviour: all Cassandra guides say that using wide rows should not affect performance, but in CQL3 they create a big performance bottleneck.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
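
The Perl one-liner in the report is hard to read through its shell escaping. As a rough illustration only, the Python sketch below builds the same CQL statements in the same order. The helper name `build_statements` and its `max_rows` parameter are made up here and are not part of the original report. The integer arithmetic also shows why the query on c = 'c1' matches exactly two rows: only i = 2 and i = 3 give k = 1.

```python
def build_statements(max_rows=200):
    """Return the CQL statements printed by the report's Perl one-liner.

    Hypothetical helper for illustration; not part of the original report.
    """
    # Schema setup, copied from the statements in the report.
    statements = [
        "DROP KEYSPACE t;",
        "CREATE KEYSPACE t WITH replication = "
        "{'class': 'SimpleStrategy', 'replication_factor' : 1};",
        "use t;",
        "CREATE TABLE t (a varchar PRIMARY KEY, b varchar, c varchar, d varchar);",
        "CREATE INDEX t_b ON t (b);",
        "CREATE INDEX t_c ON t (c);",
        "CREATE INDEX t_d ON t (d);",
    ]
    for i in range(max_rows):
        # Perl's int() truncates; for non-negative values that equals floor division.
        j = i * 10 // max_rows   # 10 distinct values in column b
        k = i * 100 // max_rows  # 100 distinct values in column c
        statements.append(
            f"INSERT INTO t (a, b, c, d) VALUES ('a{i}', 'b{j}', 'c{k}', 'd{i}');"
        )
    return statements


if __name__ == "__main__":
    print("\n".join(build_statements()))
```

Assuming the sketch is saved under a name of your choosing, its output can be piped into cqlsh the same way the Perl version is.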