cassandra-commits mailing list archives

From "Tyler Hobbs (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-9664) Allow MV's select statements to be more complex
Date Mon, 31 Aug 2015 22:28:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724150#comment-14724150 ]

Tyler Hobbs commented on CASSANDRA-9664:
----------------------------------------

I've discovered one limitation of this: I don't think we will be able to support filtering
on regular columns in the base table due to timestamp issues.

Suppose we have a base table like this:

{noformat}
CREATE TABLE base (a int, b int, c int, d int, PRIMARY KEY (a, b));
{noformat}

and a view:

{noformat}
CREATE MATERIALIZED VIEW view AS
    SELECT * FROM base
    WHERE a IS NOT NULL
    AND b IS NOT NULL
    AND c = 1
    PRIMARY KEY (a, b);
{noformat}

The view should only contain rows where {{c = 1}}.

Now, suppose we do the following:

{noformat}
INSERT INTO base (a, b, c, d) VALUES (0, 0, 1, 0) USING TIMESTAMP 0;
UPDATE base USING TIMESTAMP 2 SET c = 1 WHERE a = 0 AND b = 0;
UPDATE base USING TIMESTAMP 3 SET c = 0 WHERE a = 0 AND b = 0;  -- view row should be deleted w/ timestamp 3
UPDATE base USING TIMESTAMP 4 SET c = 1 WHERE a = 0 AND b = 0;
{noformat}

The third statement results in a tombstone being written to the view row with a timestamp of
3 because {{c}} no longer matches our SELECT statement. The problem comes at the final update:
now that {{c}} matches the SELECT again, we should reinsert a row into the view. For the
reinsertion, we can use timestamp 4 for the row marker and {{c}}'s cell, but what about {{d}}'s
cell? If we use its timestamp from the base row (0), the view tombstone supersedes it, leaving
{{d}} as null. If we use timestamp 4 for {{d}}'s cell, we may silently drop later-arriving updates
to {{d}} that carry a lower timestamp. For example, this would update {{d}} in the base table but not the view:

{noformat}
UPDATE base USING TIMESTAMP 1 SET d = 1 WHERE a = 0 AND b = 0;
{noformat}

Note that the timestamp of 1 is higher than {{d}}'s timestamp in the base table (0) but lower
than {{d}}'s timestamp in the view (4).
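
The two choices for {{d}}'s timestamp can be replayed in a small simulation. This is only a sketch of last-write-wins reconciliation under simplifying assumptions, not Cassandra's storage engine: the {{Row}} class, the {{replay}} helper, and the {{d_reinsert_ts}} parameter are hypothetical names introduced for illustration, timestamp ties between cells are ignored, and a tombstone is assumed to shadow any cell written at or before its timestamp.

```python
from typing import Dict, Optional, Tuple

Cell = Tuple[int, int]  # (value, write timestamp)


class Row:
    """Toy last-write-wins row: one cell per column plus a row tombstone."""

    def __init__(self) -> None:
        self.cells: Dict[str, Cell] = {}
        self.deleted_at: Optional[int] = None  # row tombstone timestamp

    def write(self, column: str, value: int, ts: int) -> None:
        # Keep whichever cell has the higher timestamp (ties are ignored here).
        current = self.cells.get(column)
        if current is None or ts > current[1]:
            self.cells[column] = (value, ts)

    def delete(self, ts: int) -> None:
        if self.deleted_at is None or ts > self.deleted_at:
            self.deleted_at = ts

    def read(self, column: str) -> Optional[int]:
        cell = self.cells.get(column)
        if cell is None:
            return None
        # Assumption: a tombstone shadows cells written at or before it.
        if self.deleted_at is not None and cell[1] <= self.deleted_at:
            return None
        return cell[0]


def replay(d_reinsert_ts: int) -> Tuple[Row, Row]:
    """Replay the four writes; d_reinsert_ts is the timestamp used for
    d's cell when the row is reinserted into the view."""
    base, view = Row(), Row()
    for col, val in (("c", 1), ("d", 0)):  # INSERT at timestamp 0
        base.write(col, val, 0)
        view.write(col, val, 0)
    base.write("c", 1, 2)                  # c = 1 at timestamp 2
    view.write("c", 1, 2)
    base.write("c", 0, 3)                  # c = 0 at timestamp 3:
    view.delete(3)                         # row leaves the view
    base.write("c", 1, 4)                  # c = 1 at timestamp 4:
    view.write("c", 1, 4)                  # row re-enters the view
    view.write("d", 0, d_reinsert_ts)      # d still holds 0 in the base row
    return base, view


# Option 1: reuse d's base timestamp (0); the tombstone at 3 hides it.
_, view_a = replay(d_reinsert_ts=0)
print(view_a.read("c"), view_a.read("d"))  # 1 None

# Option 2: stamp d with 4; a later-arriving write at timestamp 1 is lost.
base_b, view_b = replay(d_reinsert_ts=4)
base_b.write("d", 1, 1)
view_b.write("d", 1, 1)
print(base_b.read("d"), view_b.read("d"))  # 1 0
```

Under these assumptions, neither timestamp keeps the view consistent with the base table: the first leaves {{d}} null in the view, and the second leaves the view holding a stale value of {{d}}.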

In light of this, I suggest that the first iteration only allow the SELECT statement to
restrict PK columns.

> Allow MV's select statements to be more complex
> -----------------------------------------------
>
>                 Key: CASSANDRA-9664
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9664
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Carl Yeksigian
>             Fix For: 3.x
>
>
> [Materialized Views|https://issues.apache.org/jira/browse/CASSANDRA-6477] add support
> for a syntax which includes a {{SELECT}} statement, but only allows selection of direct columns,
> and does not allow any filtering to take place.
> We should add support to the MV {{SELECT}} statement to bring better parity with the
> normal CQL {{SELECT}} statement, specifically simple functions in the selected columns, as
> well as specifying a {{WHERE}} clause.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
