cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From DuyHai Doan <doanduy...@gmail.com>
Subject Re: No columns are defined for Materialized View other than primary key
Date Thu, 07 Sep 2017 15:46:16 GMT
"As I described, non-filtered full scans on MV are more efficient than
filtered full scans on a table"

--> But if your MV has the same primary key as your view, how can it be
possible ?

Can you elaborate on what you mean by "non filtered full scan on MV" ?
Please give us some sample SELECT queries

On Thu, Sep 7, 2017 at 5:11 PM, Alex Kotelnikov <
alex.kotelnikov@diginetica.com> wrote:

> In this example all tables and materialized views share all columns. What
> is the question?
>
> On 7 September 2017 at 17:26, sha p <shatesttest@gmail.com> wrote:
>
>> There is one more column "data" here in MView?
>>
>> On 7 Sep 2017 7:49 p.m., "DuyHai Doan" <doanduyhai@gmail.com> wrote:
>>
>>> The answer of your question is in the error message. For once it's very
>>> clear. The primary key of your materialized view is EXACTLY the same as for
>>> your base table.
>>>
>>> So the question is what's the point creating this materialized view ...
>>>
>>>
>>>
>>> On Thu, Sep 7, 2017 at 4:01 PM, Alex Kotelnikov <
>>> alex.kotelnikov@diginetica.com> wrote:
>>>
>>>> Hey. I have a problem creating a materialized view.
>>>>
>>>> My case is quite similar to
>>>> https://issues.apache.org/jira/browse/CASSANDRA-13564
>>>> but discussion in comments there faded, let me describe by case.
>>>>
>>>> I have a table like
>>>> CREATE TABLE users (
>>>>   site_id int,
>>>>   user_id text,
>>>>   n int,
>>>>   data set<frozen<text>>,
>>>>   PRIMARY KEY ((site_id, user_id), n));
>>>>
>>>> user data is updated and read by PK and sometimes I have to fetch all
>>>> user for some specific site_id. It appeared that full scan by
>>>> token(site_id,user_id) filtered by WHERE site_id = <some id> works
much
>>>> slower than unfiltered full scan on
>>>> CREATE MATERIALIZED VIEW users_1 AS
>>>> SELECT site_id, user_id, n, data
>>>> FROM users
>>>> WHERE site_id = 1 AND user_id IS NOT NULL AND n IS NOT NULL
>>>> PRIMARY KEY ((site_id, user_id), n);
>>>>
>>>> yes, you have to do so for each site_id, but it makes such bulk fetches
>>>> much faster. (When I do so, I am always puzzled, why I have to put NOT NULL
>>>> for a part of a primary key).
>>>> And just in case, I tried secondary indices on site_id. For such use
>>>> they improve nothing.
>>>>
>>>>
>>>> But things are changing and we realized that we want to get rid of
>>>> clustering key, n.
>>>>
>>>> DROP MATERIALIZED VIEW users_1;
>>>> DROP TABLE users;
>>>>
>>>> CREATE TABLE users (
>>>> site_id int,
>>>> user_id text,
>>>> data set<text>,
>>>> PRIMARY KEY ((site_id, user_id)));
>>>>
>>>> CREATE MATERIALIZED VIEW users_1 AS
>>>> SELECT site_id, user_id, data
>>>> FROM users
>>>> WHERE site_id = 1 AND user_id IS NOT NULL
>>>> PRIMARY KEY ((site_id, user_id));
>>>>
>>>> And here I get the error I listed in the subject.
>>>> InvalidRequest: Error from server: code=2200 [Invalid query]
>>>> message="No columns are defined for Materialized View other than primary
>>>> key"
>>>>
>>>> But why? I still expect scans to be faster with MV. It appears to be
>>>> possible to create a dummy column and using as a clustering key. That's
>>>> ugly.
>>>> --
>>>>
>>>> Best Regards,
>>>>
>>>>
>>>> *Alexander Kotelnikov*
>>>>
>>>> *Team Lead*
>>>>
>>>> DIGINETICA
>>>> Retail Technology Company
>>>>
>>>> m: +7.921.915.06.28 <+7%20921%20915-06-28>
>>>>
>>>> *www.diginetica.com <http://www.diginetica.com/>*
>>>>
>>>
>>>
>
>
> --
>
> Best Regards,
>
>
> *Alexander Kotelnikov*
>
> *Team Lead*
>
> DIGINETICA
> Retail Technology Company
>
> m: +7.921.915.06.28 <+7%20921%20915-06-28>
>
> *www.diginetica.com <http://www.diginetica.com/>*
>

Mime
View raw message