cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Kotelnikov <alex.kotelni...@diginetica.com>
Subject Re: No columns are defined for Materialized View other than primary key
Date Thu, 07 Sep 2017 15:11:19 GMT
In this example all tables and materialized views share all columns. What
is the question?

On 7 September 2017 at 17:26, sha p <shatesttest@gmail.com> wrote:

> There is one more column "data" here in MView?
>
> On 7 Sep 2017 7:49 p.m., "DuyHai Doan" <doanduyhai@gmail.com> wrote:
>
>> The answer of your question is in the error message. For once it's very
>> clear. The primary key of your materialized view is EXACTLY the same as for
>> your base table.
>>
>> So the question is what's the point creating this materialized view ...
>>
>>
>>
>> On Thu, Sep 7, 2017 at 4:01 PM, Alex Kotelnikov <
>> alex.kotelnikov@diginetica.com> wrote:
>>
>>> Hey. I have a problem creating a materialized view.
>>>
>>> My case is quite similar to
>>> https://issues.apache.org/jira/browse/CASSANDRA-13564
>>> but discussion in comments there faded, let me describe by case.
>>>
>>> I have a table like
>>> CREATE TABLE users (
>>>   site_id int,
>>>   user_id text,
>>>   n int,
>>>   data set<frozen<text>>,
>>>   PRIMARY KEY ((site_id, user_id), n));
>>>
>>> user data is updated and read by PK and sometimes I have to fetch all
>>> user for some specific site_id. It appeared that full scan by
>>> token(site_id,user_id) filtered by WHERE site_id = <some id> works much
>>> slower than unfiltered full scan on
>>> CREATE MATERIALIZED VIEW users_1 AS
>>> SELECT site_id, user_id, n, data
>>> FROM users
>>> WHERE site_id = 1 AND user_id IS NOT NULL AND n IS NOT NULL
>>> PRIMARY KEY ((site_id, user_id), n);
>>>
>>> yes, you have to do so for each site_id, but it makes such bulk fetches
>>> much faster. (When I do so, I am always puzzled, why I have to put NOT NULL
>>> for a part of a primary key).
>>> And just in case, I tried secondary indices on site_id. For such use
>>> they improve nothing.
>>>
>>>
>>> But things are changing and we realized that we want to get rid of
>>> clustering key, n.
>>>
>>> DROP MATERIALIZED VIEW users_1;
>>> DROP TABLE users;
>>>
>>> CREATE TABLE users (
>>> site_id int,
>>> user_id text,
>>> data set<text>,
>>> PRIMARY KEY ((site_id, user_id)));
>>>
>>> CREATE MATERIALIZED VIEW users_1 AS
>>> SELECT site_id, user_id, data
>>> FROM users
>>> WHERE site_id = 1 AND user_id IS NOT NULL
>>> PRIMARY KEY ((site_id, user_id));
>>>
>>> And here I get the error I listed in the subject.
>>> InvalidRequest: Error from server: code=2200 [Invalid query] message="No
>>> columns are defined for Materialized View other than primary key"
>>>
>>> But why? I still expect scans to be faster with MV. It appears to be
>>> possible to create a dummy column and using as a clustering key. That's
>>> ugly.
>>> --
>>>
>>> Best Regards,
>>>
>>>
>>> *Alexander Kotelnikov*
>>>
>>> *Team Lead*
>>>
>>> DIGINETICA
>>> Retail Technology Company
>>>
>>> m: +7.921.915.06.28 <+7%20921%20915-06-28>
>>>
>>> *www.diginetica.com <http://www.diginetica.com/>*
>>>
>>
>>


-- 

Best Regards,


*Alexander Kotelnikov*

*Team Lead*

DIGINETICA
Retail Technology Company

m: +7.921.915.06.28

*www.diginetica.com <http://www.diginetica.com/>*

Mime
View raw message