cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Kotelnikov <alex.kotelni...@diginetica.com>
Subject Re: No columns are defined for Materialized View other than primary key
Date Thu, 07 Sep 2017 15:09:46 GMT
As I described, non-filtered full scans on MV are more efficient than
filtered full scans on a table.

On 7 September 2017 at 17:19, DuyHai Doan <doanduyhai@gmail.com> wrote:

> The answer of your question is in the error message. For once it's very
> clear. The primary key of your materialized view is EXACTLY the same as for
> your base table.
>
> So the question is what's the point creating this materialized view ...
>
>
>
> On Thu, Sep 7, 2017 at 4:01 PM, Alex Kotelnikov <
> alex.kotelnikov@diginetica.com> wrote:
>
>> Hey. I have a problem creating a materialized view.
>>
>> My case is quite similar to
>> https://issues.apache.org/jira/browse/CASSANDRA-13564
>> but discussion in comments there faded, let me describe by case.
>>
>> I have a table like
>> CREATE TABLE users (
>>   site_id int,
>>   user_id text,
>>   n int,
>>   data set<frozen<text>>,
>>   PRIMARY KEY ((site_id, user_id), n));
>>
>> user data is updated and read by PK and sometimes I have to fetch all
>> user for some specific site_id. It appeared that full scan by
>> token(site_id,user_id) filtered by WHERE site_id = <some id> works much
>> slower than unfiltered full scan on
>> CREATE MATERIALIZED VIEW users_1 AS
>> SELECT site_id, user_id, n, data
>> FROM users
>> WHERE site_id = 1 AND user_id IS NOT NULL AND n IS NOT NULL
>> PRIMARY KEY ((site_id, user_id), n);
>>
>> yes, you have to do so for each site_id, but it makes such bulk fetches
>> much faster. (When I do so, I am always puzzled, why I have to put NOT NULL
>> for a part of a primary key).
>> And just in case, I tried secondary indices on site_id. For such use they
>> improve nothing.
>>
>>
>> But things are changing and we realized that we want to get rid of
>> clustering key, n.
>>
>> DROP MATERIALIZED VIEW users_1;
>> DROP TABLE users;
>>
>> CREATE TABLE users (
>> site_id int,
>> user_id text,
>> data set<text>,
>> PRIMARY KEY ((site_id, user_id)));
>>
>> CREATE MATERIALIZED VIEW users_1 AS
>> SELECT site_id, user_id, data
>> FROM users
>> WHERE site_id = 1 AND user_id IS NOT NULL
>> PRIMARY KEY ((site_id, user_id));
>>
>> And here I get the error I listed in the subject.
>> InvalidRequest: Error from server: code=2200 [Invalid query] message="No
>> columns are defined for Materialized View other than primary key"
>>
>> But why? I still expect scans to be faster with MV. It appears to be
>> possible to create a dummy column and using as a clustering key. That's
>> ugly.
>> --
>>
>> Best Regards,
>>
>>
>> *Alexander Kotelnikov*
>>
>> *Team Lead*
>>
>> DIGINETICA
>> Retail Technology Company
>>
>> m: +7.921.915.06.28 <+7%20921%20915-06-28>
>>
>> *www.diginetica.com <http://www.diginetica.com/>*
>>
>
>


-- 

Best Regards,


*Alexander Kotelnikov*

*Team Lead*

DIGINETICA
Retail Technology Company

m: +7.921.915.06.28

*www.diginetica.com <http://www.diginetica.com/>*

Mime
View raw message