cassandra-commits mailing list archives

From "Vincent White (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13127) Materialized Views: View row expires too soon
Date Wed, 25 Jan 2017 07:59:26 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837348#comment-15837348
] 

Vincent White commented on CASSANDRA-13127:
-------------------------------------------

I think I pretty much understand what's happening here. It basically all stems from the base
upsert behaviour (creating a row via {{UPDATE}}, so the primary key columns don't exist on
their own, vs {{INSERT}}). I'm still not sure it matches the MV docs though, and the comments
in the code say things like:
{code}
1) either the columns for the base and view PKs are exactly the same: in that case,
the view entry should live as long as the base row lives. This means the view entry should
only expire once *everything* in the base row has expired. Which means the row TTL should
be the max of any other TTL.
{code}
I think the logic in {{computeLivenessInfoForEntry}} doesn't
make sense for updates because it only ever expected inserts. It leads to some funky behaviour
if you're mixing updates, inserts and TTLs. I didn't test with deletes but I guess they could
cause similar results.
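
To make the difference concrete, here is a minimal sketch using the numbers from the issue description (row inserted at t=0 with TTL 10, then {{v}} updated at t=6 with TTL 8). It is a simplified, hypothetical model with plain ints, not the actual Cassandra classes: {{ViewLivenessSketch}}, {{expirationByTtl}} and {{expirationByDeletionTime}} are made-up names, and each cell is represented as a {ttl, localDeletionTime} pair.

```java
// Hypothetical, simplified model of the comparison; not Cassandra code.
public class ViewLivenessSketch {

    // Pick the view entry's expiration by comparing TTLs.
    // Each cell is {ttl, localDeletionTime}, both in seconds.
    public static int expirationByTtl(int rowTtl, int rowExpiration, int[][] cells) {
        int ttl = rowTtl;
        int expiration = rowExpiration;
        for (int[] cell : cells) {
            if (cell[0] > ttl) {
                ttl = cell[0];
                expiration = cell[1];
            }
        }
        return expiration;
    }

    // Pick the view entry's expiration by comparing expiration times instead.
    public static int expirationByDeletionTime(int rowExpiration, int[][] cells) {
        int expiration = rowExpiration;
        for (int[] cell : cells) {
            if (cell[1] > expiration) {
                expiration = cell[1];
            }
        }
        return expiration;
    }

    public static void main(String[] args) {
        // Row liveness: written at t=0 with TTL 10, so it expires at t=10.
        // Cell v: written at t=6 with TTL 8, so it expires at t=14.
        int[][] cells = { {8, 14} };
        System.out.println(expirationByTtl(10, 10, cells));       // prints 10
        System.out.println(expirationByDeletionTime(10, cells));  // prints 14
    }
}
```

In this model the TTL comparison keeps the row's expiration (t=10) because 8 is not greater than 10, even though the cell outlives the row marker until t=14. Comparing expiration times picks t=14, which is what "max of any other TTL" seems to intend.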

Simply patching computeLivenessInfoForEntry like:
{code:title=ViewUpdateGenerator.java#computeLivenessInfoForEntry}
            int expirationTime = baseLiveness.localExpirationTime();
            for (Cell cell : baseRow.cells())
            {

-                if (cell.ttl() > ttl)
+                if (cell.localDeletionTime() > expirationTime)
                {
                    ttl = cell.ttl();
                    expirationTime = cell.localDeletionTime();
                }
            }
-            return ttl == baseLiveness.ttl()
+            return expirationTime == baseLiveness.localExpirationTime()
                 ? baseLiveness
                 : LivenessInfo.withExpirationTime(baseLiveness.timestamp(), ttl, expirationTime);
        }
{code}
isn't enough because it leads to further unexpected behaviour where update statements
will resurrect previously TTL'd MV entries in some cases. If an update statement sets a column
that could cause the update of _any_ view in that keyspace, it will resurrect entries in views
that have PKs made up of only columns from the base PK, regardless of whether the statement
updates non-PK columns in that view. If the update statement only sets values of columns that
don't appear in the keyspace's MVs, then no TTL'd MV entries for that PK will be resurrected.
If there was never an entry in the MV for that MV PK, then it won't create a new one. This
is because upserts don't create new MV entries unless they set the value of a non-PK column
in that view (with or without this patch).

I don't think I've seen it referenced anywhere, but is that intended behaviour when using upserts
and materialized views? That is, an {{UPDATE}} to a column not in a view will not create an entry
in an MV if the view's PK is made up only of columns from the base table's PK, but the matching
{{INSERT}} statement will?

> Materialized Views: View row expires too soon
> ---------------------------------------------
>
>                 Key: CASSANDRA-13127
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13127
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Duarte Nunes
>
> Consider the following commands, run against trunk:
> {code}
> echo "DROP MATERIALIZED VIEW ks.mv; DROP TABLE ks.base;" | bin/cqlsh
> echo "CREATE TABLE ks.base (p int, c int, v int, PRIMARY KEY (p, c));" | bin/cqlsh
> echo "CREATE MATERIALIZED VIEW ks.mv AS SELECT p, c FROM base WHERE p IS NOT NULL AND c IS NOT NULL PRIMARY KEY (c, p);" | bin/cqlsh
> echo "INSERT INTO ks.base (p, c) VALUES (0, 0) USING TTL 10;" | bin/cqlsh
> # wait for row liveness to get closer to expiration
> sleep 6;
> echo "UPDATE ks.base USING TTL 8 SET v = 0 WHERE p = 0 and c = 0;" | bin/cqlsh
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+--------
>  0 | 0 |      7
> (1 rows)
>  c | p
> ---+---
>  0 | 0
> (1 rows)
> # wait for row liveness to expire
> sleep 4;
> echo "SELECT p, c, ttl(v) FROM ks.base; SELECT * FROM ks.mv;" | bin/cqlsh
>  p | c | ttl(v)
> ---+---+--------
>  0 | 0 |      3
> (1 rows)
>  c | p
> ---+---
> (0 rows)
> {code}
> Notice how the view row is removed even though the base row is still live. I would say
> this is because in ViewUpdateGenerator#computeLivenessInfoForEntry the TTLs are compared instead
> of the expiration times, but I'm not sure I'm getting that far ahead in the code when updating
> a column that's not in the view.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
