cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7844) Fetching a single static column requires scanning to the first live CQL row
Date Thu, 28 Aug 2014 07:18:58 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113463#comment-14113463
] 

Sylvain Lebresne commented on CASSANDRA-7844:
---------------------------------------------

tl;dr use {{DISTINCT}} ({{SELECT DISTINCT next_id FROM friends WHERE user='user1'}}).

If you do a query like {{SELECT next_id FROM friends WHERE user='user1'}}, we will query the
whole partition the same way we would do it if the partition key was the only thing selected
(and you'll have to use {{DISTINCT}} in that case too to avoid it).  See CASSANDRA-7305 for
background.

Now you may ask: ok, but I have a {{LIMIT 1}} in my query, so why does it scan the whole 6
tombstoned cells. Good question. That part lies in how we count row in {{SliceQueryFilter}}.
Because cells are not grouped in CQL rows in the underlying storage engine, we have to decompose
cell names and make sure we only count 1 for all the cells pertaining to the same CQL row.
Since we only know that we've gathered all the cells of a particular CQL row when we get a
cell from another CQL row, the stopping condition is {{columnCounter.live() > limit}}.
In other words, with {{LIMIT 1}}, we stop as soon as we've seen the first cell of the 2nd
live CQL row.  But tombstones don't count towards the live count, so if you have 1 live CQL
row following by scores of tombstones, we will read all the tombstones, because they are not
"the first cell of the 2nd *live* row". I'll note that all this is not related to static columns
in any way. It would be possible to change the logic to make a difference between "I've started
counting the Xth row but there may still have more cells for that CQL row to add" and "I've
counted X row but I've seen a tombstone that don't belong to the Xth row so if my limit is
X I'm done". That said, it would probably be a bit ugly, and that would force us to decompose
the cell names of tombstones which we currently don't do, so while it would improve what is
a fairly degenerate case (remember that the solution to the example given in this ticket is
to use {{DISTINCT}}), it would add a small overhead in general. Overall, I'm tempted to leave
things as they are: we're hoping to make the storage engine a bit more aware of CQL rows in
the future, and that will probably make that kind of things go away without requiring ugly
special casing.


> Fetching a single static column requires scanning to the first live CQL row
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-7844
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7844
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Nicolas Favre-Felix
>              Labels: perfomance
>
> Reading a single static column seems to do more work than needed, scanning the partition
until the first live CQL row before returning a value.
> As I understand, static columns are stored separately from clustered rows (see CASSANDRA-6956
for an issue that arised from this storage model). Nevertheless, Cassandra doesn't optimize
for the case where only static columns and partition key dimensions are retrieved.
> Selecting a static column on its own is possible:
> {code}
> > create table friends (user text, next_id int static, friend_id int, email text,
primary key(user,friend_id));
> > insert into friends (user, next_id) values ('user1', 1);
> > select * from friends where user = 'user1';
>  user  | friend_id | next_id | email
> -------+-----------+---------+-------
>  user1 |      null |       1 |  null
> (1 rows)
> {code}
> Let's insert and delete some clustered data:
> {code}
> > insert into friends (user, next_id, friend_id, email) values ('user1', 2, 1, 'abc@foo');
> > insert into friends (user, next_id, friend_id, email) values ('user1', 3, 2, 'def@foo');
> > insert into friends (user, next_id, friend_id, email) values ('user1', 4, 3, 'ghi@foo');
> > select * from friends where user = 'user1';
>  user  | friend_id | next_id | email
> -------+-----------+---------+---------
>  user1 |         1 |       4 | abc@foo
>  user1 |         2 |       4 | def@foo
>  user1 |         3 |       4 | ghi@foo
> (3 rows)
> > delete from friends where user = 'user1' and friend_id = 1;
> > delete from friends where user = 'user1' and friend_id = 2;
> > delete from friends where user = 'user1' and friend_id = 3;
> {code}
> And then fetch the static column again:
> {code}
> > TRACING ON
> Now tracing requests.
> > select next_id from friends where user = 'user1' limit 1;
>  next_id
> ---------
>        4
> (1 rows)
> Tracing session: 597cc970-2e27-11e4-932f-c551d8e65d14
>  activity                                                                  | timestamp
   | source    | source_elapsed
> ---------------------------------------------------------------------------+--------------+-----------+----------------
>                                                         execute_cql3_query | 13:18:46,792
| 127.0.0.1 |              0
>          Parsing SELECT next_id from friends where user = 'user1' LIMIT 1; | 13:18:46,792
| 127.0.0.1 |             59
>                                                        Preparing statement | 13:18:46,792
| 127.0.0.1 |            125
>                                Executing single-partition query on friends | 13:18:46,792
| 127.0.0.1 |            357
>                                               Acquiring sstable references | 13:18:46,792
| 127.0.0.1 |            369
>                                                Merging memtable tombstones | 13:18:46,792
| 127.0.0.1 |            381
>  Skipped 0/0 non-slice-intersecting sstables, included 0 due to tombstones | 13:18:46,792
| 127.0.0.1 |            445
>                                 Merging data from memtables and 0 sstables | 13:18:46,792
| 127.0.0.1 |            460
>                                         Read 1 live and 6 tombstoned cells | 13:18:46,792
| 127.0.0.1 |            504
>                                                           Request complete | 13:18:46,792
| 127.0.0.1 |            711
> {code}
>     
>     
> We went over tombstones instead of returning the static column immediately.
> Is this possibly related to CASSANDRA-7085?



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message