cassandra-commits mailing list archives

From " Brian Hess (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
Date Fri, 17 Jul 2015 22:01:13 GMT


 Brian Hess commented on CASSANDRA-6477:

Let's go back to the basic use case that this is supposed to replace/help/make better:
the case where we want two query tables for the same data.  That is, the two tables have the
same primary key columns, but different partition keys (and clustering column orders).

Today, I would do this by having a logged batch for the insert and that batch would insert
into each of the two query tables.  With this I get some data consistency guarantees.  For
example, if the client returns "success", I know that *both* of the inserts were accepted
at the desired consistency level.  So, if I did 2 writes at CL_QUORUM, and I receive a "success",
then I know I can then do a CL_QUORUM read of *either* table and see the most recent data.
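
To make the logged-batch pattern concrete, here is a minimal sketch that builds one such batch. The table names (users_by_id, users_by_email), the columns, and the helper name are hypothetical, chosen only for illustration. The consistency level would be set by the client when it executes the statement, which is not shown here.

```python
def build_logged_batch(user_id, email, name):
    """Build one logged batch that writes the same row to both query tables.

    Table and column names are hypothetical. Returns (cql, params) with
    %s-style placeholders; a plain BEGIN BATCH is a logged batch in CQL.
    """
    cql = (
        "BEGIN BATCH\n"
        "  INSERT INTO users_by_id (user_id, email, name) VALUES (%s, %s, %s);\n"
        "  INSERT INTO users_by_email (email, user_id, name) VALUES (%s, %s, %s);\n"
        "APPLY BATCH;"
    )
    # Same data, bound once per table in that table's column order.
    params = (user_id, email, name, email, user_id, name)
    return cql, params


cql, params = build_logged_batch(42, "a@example.com", "Ann")
print(cql)
```

Both inserts travel in a single request, so a "success" at the chosen consistency level covers both tables.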

However, with this "asynchronous" MV approach, I no longer get this behavior.  I write to
the base table at CL_QUORUM and get the "success" return.  At that point, I can do a CL_QUORUM
read from the base table and see the most recent insert.  However, if I do a CL_QUORUM read
from the MV, I have no guarantees at all.
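
The gap can be illustrated with a toy model. This is not how Cassandra or the proposed patch is implemented. It is only a sketch, assuming a replication factor of 3, last-write-wins timestamps, and a view update that is applied some time after the base write is acknowledged. All class and function names are made up for the example.

```python
def quorum(replication_factor):
    """Majority of replicas, e.g. 2 of 3."""
    return replication_factor // 2 + 1


class ToyTable:
    """Toy table: each replica maps key -> (timestamp, value)."""

    def __init__(self, rf=3):
        self.rf = rf
        self.replicas = [dict() for _ in range(rf)]

    def write(self, key, value, ts, replica_ids):
        for i in replica_ids:
            self.replicas[i][key] = (ts, value)

    def read(self, key, replica_ids):
        """Return the newest value among the contacted replicas, or None."""
        hits = [self.replicas[i][key] for i in replica_ids if key in self.replicas[i]]
        return max(hits)[1] if hits else None


def demo():
    base, view = ToyTable(), ToyTable()
    # An older row is present on every replica of both tables.
    base.write("k", "old", ts=1, replica_ids=range(3))
    view.write("k", "old", ts=1, replica_ids=range(3))

    # Client writes to the base table; replicas 0 and 1 ack (a quorum).
    base.write("k", "new", ts=2, replica_ids=[0, 1])
    pending_view_update = ("k", "new", 2)  # assumed: not applied yet

    # Every possible quorum read of the base table overlaps the write.
    base_reads = [base.read("k", ids) for ids in ([0, 1], [0, 2], [1, 2])]
    # A quorum read of the view before the update lands sees stale data.
    view_before = view.read("k", [1, 2])

    # Later, the view update reaches a quorum of view replicas.
    view.write(*pending_view_update, replica_ids=[0, 1])
    view_after = view.read("k", [1, 2])
    return base_reads, view_before, view_after


print(quorum(3), demo())
```

In this model the base table reads always return "new", while the view read returns "old" until the delayed update is applied.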

This approach does not address the basic situation that we are trying to cover.  That concerns
me greatly.

> Materialized Views (was: Global Indexes)
> ----------------------------------------
>                 Key: CASSANDRA-6477
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>            Assignee: Carl Yeksigian
>              Labels: cql
>             Fix For: 3.0 beta 1
>         Attachments:, users.yaml
> Local indexes are suitable for low-cardinality data, where spreading the index across
> the cluster is a Good Thing.  However, for high-cardinality data, local indexes require querying
> most nodes in the cluster even if only a handful of rows is returned.
