cassandra-commits mailing list archives

From "Fridtjof Sander (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13657) Materialized Views: Index MV on TTL'ed column produces orphanized view entry if another column keeps entry live
Date Mon, 10 Jul 2017 11:14:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080177#comment-16080177 ]

Fridtjof Sander commented on CASSANDRA-13657:
---------------------------------------------

[~krishna.koneru] Thanks for the swift reaction. Looks good to me.

I don't want to appear unappreciative, but with [CASSANDRA-11500|https://issues.apache.org/jira/browse/CASSANDRA-11500]
also in mind, I'm wondering whether there is a more general solution to the consistency problems
of MVs.

I drafted an idea and created kind of a POC here: https://github.com/f-sander/apache-cassandra/tree/short_circuiting_liveness_info

The general idea is as follows:

The usual "live" definition for a row is as follows:
{quote}
A row is live iff its liveness-info is live or any of its cells is live
{quote}
For rows of MVs that move a base column into the MV's PK (aka index views), the definition
changes:
{quote}
A MV row is live iff the indexed column's cell in the base row is live
{quote}
Instead of modeling this liveness dependency explicitly, we are trying to squeeze it into
the first definition. In my opinion, that's a source of complicated code, hence oversights,
hence bugs.

So, what would an explicit model of that dependency look like?

What we could do is add a flag to the liveness-info that short-circuits the live-definition.
The live-definition then changes to:
{quote}
A row is live iff its liveness-info is live OR (its liveness-info is not flagged AND any of
its cells is live)
{quote}
In other words, the cells' "livenesses" are only considered if the parent row's liveness-info
is not flagged. Otherwise the liveness-info defines the whole row's liveness.
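
To make the difference between the two definitions concrete, here is a minimal sketch in Java. It is only an illustration of the rule as stated above, not the code from the POC branch and not Cassandra's actual classes: the names {{LivenessSketch}}, {{LivenessInfo}}, {{shortCircuit}}, {{isLiveUsual}} and {{isLiveProposed}} are all made up for this example, and liveness is reduced to plain booleans.

```java
import java.util.List;

// Hypothetical, simplified model. These are NOT Cassandra's real types.
final class LivenessSketch {

    /** Row-level liveness-info, reduced to "is it live?" plus the proposed flag. */
    static final class LivenessInfo {
        final boolean live;
        final boolean shortCircuit; // the proposed flag (name is an assumption)

        LivenessInfo(boolean live, boolean shortCircuit) {
            this.live = live;
            this.shortCircuit = shortCircuit;
        }
    }

    /** True if at least one cell in the row is live. */
    static boolean anyCellLive(List<Boolean> cellsLive) {
        for (boolean cellLive : cellsLive) {
            if (cellLive) {
                return true;
            }
        }
        return false;
    }

    /** Usual definition: the liveness-info is live or any cell is live. */
    static boolean isLiveUsual(LivenessInfo info, List<Boolean> cellsLive) {
        return info.live || anyCellLive(cellsLive);
    }

    /** Proposed definition: cells are only consulted when the info is not flagged. */
    static boolean isLiveProposed(LivenessInfo info, List<Boolean> cellsLive) {
        return info.live || (!info.shortCircuit && anyCellLive(cellsLive));
    }
}
```

With a flagged, expired liveness-info and one live cell, {{isLiveUsual}} still reports the row as live, while {{isLiveProposed}} reports it as dead. For unflagged rows both definitions agree.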

Then, the view-row's liveness-info can be set equivalent to the indexed cell's liveness (plus
the flag).
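
As a rough sketch of that step, the following Java models the scenario from the issue description: the indexed column {{a}} is written with a TTL of 5 seconds, the regular column {{b}} has no TTL. Everything here is an assumption made for illustration ({{ViewLivenessSketch}}, {{Cell}}, {{LivenessInfo}}, {{forViewRow}}, {{isRowLive}}, the {{NO_TTL}} marker and the simple "expires at write time plus TTL" rule); tombstones and deletions are not modeled, and this is not how the POC branch or Cassandra itself represents liveness.

```java
// Hypothetical, simplified model. These are NOT Cassandra's real types.
final class ViewLivenessSketch {

    static final int NO_TTL = -1; // marker for "never expires" (assumption)

    /** A cell with a write time and an optional TTL, both in seconds. */
    static final class Cell {
        final long writtenAt;
        final int ttl;

        Cell(long writtenAt, int ttl) {
            this.writtenAt = writtenAt;
            this.ttl = ttl;
        }

        boolean isLive(long now) {
            return ttl == NO_TTL || now < writtenAt + ttl;
        }
    }

    /** Row-level liveness-info carrying the proposed short-circuit flag. */
    static final class LivenessInfo {
        final long writtenAt;
        final int ttl;
        final boolean shortCircuit;

        LivenessInfo(long writtenAt, int ttl, boolean shortCircuit) {
            this.writtenAt = writtenAt;
            this.ttl = ttl;
            this.shortCircuit = shortCircuit;
        }

        boolean isLive(long now) {
            return ttl == NO_TTL || now < writtenAt + ttl;
        }
    }

    /** The view row's liveness-info mirrors the indexed base cell, plus the flag. */
    static LivenessInfo forViewRow(Cell indexedBaseCell) {
        return new LivenessInfo(indexedBaseCell.writtenAt, indexedBaseCell.ttl, true);
    }

    /** Proposed definition: cells only count when the info is not flagged. */
    static boolean isRowLive(LivenessInfo info, Cell[] cells, long now) {
        if (info.isLive(now)) {
            return true;
        }
        if (info.shortCircuit) {
            return false; // flagged: the liveness-info alone decides
        }
        for (Cell cell : cells) {
            if (cell.isLive(now)) {
                return true;
            }
        }
        return false;
    }
}
```

In this model, a view row built with {{forViewRow}} from {{a}} (written at second 0, TTL 5) is live at second 3 and dead at second 6, even though its {{b}} cell never expires. The same row with an unflagged liveness-info would still be reported live at second 6, which corresponds to the orphaned entry in the issue.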

I have the impression this would greatly simplify the code and fix any problem with MV consistency
we currently have. 

> Materialized Views: Index MV on TTL'ed column produces orphanized view entry if another column keeps entry live
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13657
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13657
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Materialized Views
>            Reporter: Fridtjof Sander
>            Assignee: Krishna Dattu Koneru
>              Labels: materializedviews, ttl
>
> {noformat}
> CREATE TABLE t (k int, a int, b int, PRIMARY KEY (k));
> CREATE MATERIALIZED VIEW mv AS SELECT * FROM t WHERE k IS NOT NULL AND a IS NOT NULL PRIMARY KEY (a, k);
> INSERT INTO t (k) VALUES (1);
> UPDATE t USING TTL 5 SET a = 10 WHERE k = 1;
> UPDATE t SET b = 100 WHERE k = 1;
> SELECT * from t; SELECT * from mv;
>  k | a  | b
> ---+----+-----
>  1 | 10 | 100
> (1 rows)
>  a  | k | b
> ----+---+-----
>  10 | 1 | 100
> (1 rows)
> -- 5 seconds later
> SELECT * from t; SELECT * from mv;
>  k | a    | b
> ---+------+-----
>  1 | null | 100
> (1 rows)
>  a  | k | b
> ----+---+-----
>  10 | 1 | 100
> (1 rows)
> -- that view entry's liveness-info is (probably) dead, but the entry is kept alive by b=100
> DELETE b FROM t WHERE k=1;
> SELECT * from t; SELECT * from mv;
>  k | a    | b
> ---+------+------
>  1 | null | null
> (1 rows)
>  a  | k | b
> ----+---+-----
>  10 | 1 | 100
> (1 rows)
> DELETE FROM t WHERE k=1;
> SELECT * from t; SELECT * from mv;
>  k | a | b
> ---+---+---
> (0 rows)
>  a  | k | b
> ----+---+-----
>  10 | 1 | 100
> (1 rows)
> -- deleting the base-entry doesn't help, because the view-key cannot be constructed anymore (a=10 already expired)
> {noformat}
> The problem here is that, although the view-entry's liveness-info (probably) expired correctly,
> a regular column (`b`) keeps the view-entry live. The view-entry should have disappeared, since
> its indexed column (`a`) expired in the corresponding base-row. This is pretty severe, since
> that view-entry is now orphaned.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org

