"As I described, non-filtered full scans on MV are more efficient than filtered full scans on a table"

--> But if your MV has the same primary key as your view, how can it be possible ?

Can you elaborate on what you mean by "non filtered full scan on MV" ? Please give us some sample SELECT queries

On Thu, Sep 7, 2017 at 5:11 PM, Alex Kotelnikov <alex.kotelnikov@diginetica.com> wrote:
In this example all tables and materialized views share all columns. What is the question?

On 7 September 2017 at 17:26, sha p <shatesttest@gmail.com> wrote:
There is one more column "data" here in MView?

On 7 Sep 2017 7:49 p.m., "DuyHai Doan" <doanduyhai@gmail.com> wrote:
The answer of your question is in the error message. For once it's very clear. The primary key of your materialized view is EXACTLY the same as for your base table. 

So the question is what's the point creating this materialized view ...



On Thu, Sep 7, 2017 at 4:01 PM, Alex Kotelnikov <alex.kotelnikov@diginetica.com> wrote:
Hey. I have a problem creating a materialized view.

My case is quite similar to
https://issues.apache.org/jira/browse/CASSANDRA-13564
but discussion in comments there faded, let me describe by case.

I have a table like
CREATE TABLE users (
  site_id int,
  user_id text,
  n int,
  data set<frozen<text>>,
  PRIMARY KEY ((site_id, user_id), n));

user data is updated and read by PK and sometimes I have to fetch all user for some specific site_id. It appeared that full scan by token(site_id,user_id) filtered by WHERE site_id = <some id> works much slower than unfiltered full scan on
CREATE MATERIALIZED VIEW users_1 AS
SELECT site_id, user_id, n, data
FROM users
WHERE site_id = 1 AND user_id IS NOT NULL AND n IS NOT NULL
PRIMARY KEY ((site_id, user_id), n);

yes, you have to do so for each site_id, but it makes such bulk fetches much faster. (When I do so, I am always puzzled, why I have to put NOT NULL for a part of a primary key).
And just in case, I tried secondary indices on site_id. For such use they improve nothing.


But things are changing and we realized that we want to get rid of clustering key, n.

DROP MATERIALIZED VIEW users_1;
DROP TABLE users;

CREATE TABLE users (
site_id int,
user_id text,
data set<text>,
PRIMARY KEY ((site_id, user_id)));

CREATE MATERIALIZED VIEW users_1 AS
SELECT site_id, user_id, data
FROM users
WHERE site_id = 1 AND user_id IS NOT NULL 
PRIMARY KEY ((site_id, user_id));

And here I get the error I listed in the subject. 
InvalidRequest: Error from server: code=2200 [Invalid query] message="No columns are defined for Materialized View other than primary key"

But why? I still expect scans to be faster with MV. It appears to be possible to create a dummy column and using as a clustering key. That's ugly.
--

Best Regards,

Alexander Kotelnikov
Team Lead

DIGINETICA

Retail Technology Company

m: +7.921.915.06.28

www.diginetica.com





--

Best Regards,

Alexander Kotelnikov
Team Lead

DIGINETICA

Retail Technology Company

m: +7.921.915.06.28

www.diginetica.com