Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 47A6B200CF0 for ; Thu, 7 Sep 2017 17:10:37 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 4753516105A; Thu, 7 Sep 2017 15:10:37 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 400FA160D30 for ; Thu, 7 Sep 2017 17:10:36 +0200 (CEST) Received: (qmail 47382 invoked by uid 500); 7 Sep 2017 15:10:34 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 47372 invoked by uid 99); 7 Sep 2017 15:10:34 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 07 Sep 2017 15:10:34 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 00D63C0167 for ; Thu, 7 Sep 2017 15:10:34 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.491 X-Spam-Level: ** X-Spam-Status: No, score=2.491 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RCVD_IN_SORBS_SPAM=0.5, T_KAM_HTML_FONT_INVALID=0.01, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=diginetica-com.20150623.gappssmtp.com Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id m3JRPHLOj2fi for ; Thu, 7 Sep 2017 15:10:29 +0000 (UTC) Received: from mail-wm0-f46.google.com (mail-wm0-f46.google.com [74.125.82.46]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 8258D5F3CF for ; Thu, 7 Sep 2017 15:10:28 +0000 (UTC) Received: by mail-wm0-f46.google.com with SMTP id 137so192920wmj.1 for ; Thu, 07 Sep 2017 08:10:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=diginetica-com.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=+gHhWf9A1y4JqU6MtxFRjzyR56WGw4SWRPL5GKni8cQ=; b=K4b9zZs6b/GyRxUfbI+AwbbSk1S5fpoMao0Qy/zDbtdvreI+ZVWD7s6F/nPuB52y06 5ll2k7GT0hasdhRZ+08Hu7Witgo9KmS7vVNoPVY4ScvsQijNpPOAnMqLJbYKvSf9n4nT E0hPl2NOsxlrn90eZMhMlNx0QUPa9NWUH2DJcXloZNLqlymuiGb51OECWVTl8UlAw7Ku jRtbOKhTjrtDGshl48JX8OTrEitNiUUkDGMiEW9DR5odCvVJf5k8Qxia23lEDwR5ouJP NKc3FWne06c+tC/pEfT1hKpdpEj5ySWkJZPxfvnZicxvlX6LPNw52oITPTAnPrWTKNXz Edfw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=+gHhWf9A1y4JqU6MtxFRjzyR56WGw4SWRPL5GKni8cQ=; b=WQJJiPjT/8+aGc+uSgs+hH06vUDD5eWe53A7dvuvcunS6td4CbVBJbvOUsnVtVkyDs olOJwtduUQ/10C5e0r/uVGIqfrq9476fpA9FJZy0wSa5hEU2c84i6zecd87qlSVBF3aj KayXFV2aggdr6H9/sSlTMt1sBu5txUFsmIhanTatuA37FHXNhbzkDTA+ihSszWMB9Dhz olYym87KnACLDETjksnJcgk84j1WGt5dF8SkDGdei14pW9o8eDGYCCwKeyUoiaK1lX+d 4kzO+QTJnLBqhihqgbQ4mw3xBJUVH9illduQUULSmM61LwkGZFmFz0otiEpEmmWRe2qM 5dqA== X-Gm-Message-State: AHPjjUhpJErOfd0oRDvo7D+k3nlHlilA77TLq9wiQOftJDwR0+ApSHRg coLlo9oW7oDE4NOHG8QwNVHGAkWYvH8gpFY= X-Google-Smtp-Source: ADKCNb4tmEHDwIK6ZeU8C2JhA9d0yzmVDNNMKP53oJLwFKTG7Eh/4bhGosAfKl3N+MAX1wDzUBoxVL17bHY6+yjVKCI= X-Received: by 10.28.11.195 with SMTP id 186mr744975wml.41.1504797027737; Thu, 07 Sep 2017 08:10:27 -0700 (PDT) MIME-Version: 1.0 Received: by 10.28.220.8 with HTTP; Thu, 7 Sep 2017 08:09:46 -0700 (PDT) In-Reply-To: References: From: Alex Kotelnikov Date: Thu, 7 Sep 2017 18:09:46 +0300 Message-ID: Subject: Re: No columns are defined for Materialized View other than primary key To: user@cassandra.apache.org Content-Type: multipart/alternative; boundary="001a114422544c640005589adcbb" archived-at: Thu, 07 Sep 2017 15:10:37 -0000 --001a114422544c640005589adcbb Content-Type: text/plain; charset="UTF-8" As I described, non-filtered full scans on MV are more efficient than filtered full scans on a table. On 7 September 2017 at 17:19, DuyHai Doan wrote: > The answer of your question is in the error message. For once it's very > clear. The primary key of your materialized view is EXACTLY the same as for > your base table. > > So the question is what's the point creating this materialized view ... > > > > On Thu, Sep 7, 2017 at 4:01 PM, Alex Kotelnikov < > alex.kotelnikov@diginetica.com> wrote: > >> Hey. I have a problem creating a materialized view. >> >> My case is quite similar to >> https://issues.apache.org/jira/browse/CASSANDRA-13564 >> but discussion in comments there faded, let me describe by case. >> >> I have a table like >> CREATE TABLE users ( >> site_id int, >> user_id text, >> n int, >> data set>, >> PRIMARY KEY ((site_id, user_id), n)); >> >> user data is updated and read by PK and sometimes I have to fetch all >> user for some specific site_id. It appeared that full scan by >> token(site_id,user_id) filtered by WHERE site_id = works much >> slower than unfiltered full scan on >> CREATE MATERIALIZED VIEW users_1 AS >> SELECT site_id, user_id, n, data >> FROM users >> WHERE site_id = 1 AND user_id IS NOT NULL AND n IS NOT NULL >> PRIMARY KEY ((site_id, user_id), n); >> >> yes, you have to do so for each site_id, but it makes such bulk fetches >> much faster. (When I do so, I am always puzzled, why I have to put NOT NULL >> for a part of a primary key). >> And just in case, I tried secondary indices on site_id. For such use they >> improve nothing. >> >> >> But things are changing and we realized that we want to get rid of >> clustering key, n. >> >> DROP MATERIALIZED VIEW users_1; >> DROP TABLE users; >> >> CREATE TABLE users ( >> site_id int, >> user_id text, >> data set, >> PRIMARY KEY ((site_id, user_id))); >> >> CREATE MATERIALIZED VIEW users_1 AS >> SELECT site_id, user_id, data >> FROM users >> WHERE site_id = 1 AND user_id IS NOT NULL >> PRIMARY KEY ((site_id, user_id)); >> >> And here I get the error I listed in the subject. >> InvalidRequest: Error from server: code=2200 [Invalid query] message="No >> columns are defined for Materialized View other than primary key" >> >> But why? I still expect scans to be faster with MV. It appears to be >> possible to create a dummy column and using as a clustering key. That's >> ugly. >> -- >> >> Best Regards, >> >> >> *Alexander Kotelnikov* >> >> *Team Lead* >> >> DIGINETICA >> Retail Technology Company >> >> m: +7.921.915.06.28 <+7%20921%20915-06-28> >> >> *www.diginetica.com * >> > > -- Best Regards, *Alexander Kotelnikov* *Team Lead* DIGINETICA Retail Technology Company m: +7.921.915.06.28 *www.diginetica.com * --001a114422544c640005589adcbb Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
As I described, non-filtered full scans on MV are more eff= icient than filtered full scans on a table.

<= div class=3D"gmail_quote">On 7 September 2017 at 17:19, DuyHai Doan <do= anduyhai@gmail.com> wrote:
=
The answer of your question is in the error message. For o= nce it's very clear. The primary key of your materialized view is EXACT= LY the same as for your base table.=C2=A0

So the questio= n is what's the point creating this materialized view ...

<= /div>


On Thu, Sep 7, 2017 = at 4:01 PM, Alex Kotelnikov <alex.kotelnikov@diginetica.= com> wrote:
Hey. I have a problem creating a materialized view.

My case is quite similar to
https://issues.apache.or= g/jira/browse/CASSANDRA-13564
but discussion in comments there= faded, let me describe by case.

I have a table li= ke
CREATE TABLE users (
=C2=A0 site_id int,
<= div>=C2=A0 user_id text,
=C2=A0 n int,
=C2=A0 data set&= lt;frozen<text>>,
=C2=A0 PRIMARY KEY ((site_id, user_id)= , n));

user data is updated and read by PK a= nd sometimes I have to fetch all user for some specific site_id. It appeare= d that full scan by token(site_id,user_id) filtered by WHERE site_id =3D &l= t;some id> works much slower than unfiltered full scan on
CREATE MATERIALIZED VIEW users_1 AS
SELECT site_id, user_id, n, = data
FROM users
WHERE site_id =3D 1 AND user_id IS NOT = NULL AND n IS NOT NULL
PRIMARY KEY ((site_id, user_id), n);
=

yes, you have to do so for each site_id, but it m= akes such bulk fetches much faster. (When I do so, I am always puzzled, why= I have to put NOT NULL for a part of a primary key).
And just in= case, I tried secondary indices on site_id. For such use they improve noth= ing.


But things are changing and we= realized that we want to get rid of clustering key, n.

DROP MATERIALIZED VIEW users_1;
DROP TABLE users;

CREATE TABLE users (
site_id int,
= user_id text,
data set<text>,
PRIMARY KEY ((site_= id, user_id)));

CREATE MATERIALIZED VIE= W users_1 AS
SELECT site_id, user_id, data
FROM users
WHERE site_id =3D 1 AND user_id IS NOT NULL=C2=A0
PRIMAR= Y KEY ((site_id, user_id));

And here I get t= he error I listed in the subject.=C2=A0
InvalidRequest: Error fro= m server: code=3D2200 [Invalid query] message=3D"No columns are define= d for Materialized View other than primary key"

But why? I still expect scans to be faster with MV. It appears to be= possible to create a dummy column and using as a clustering key. That'= s ugly.
--

Best Regards,

Alexander K= otelnikov
Team Lead

DIGINETICA

R= etail Technology Company

m: +7.921.915.06.28

www.diginetica.com





--
=

Best Regards,=

Alexander Kotelnikov
Team Lead

DIGINETICA

Retail Technology Company

m= : +7.921.915.06.28
<= /p>

www.diginetica.com

--001a114422544c640005589adcbb--