From: Jesus Camacho Rodriguez
To: "dev@calcite.apache.org", Maryann Xue
Date: Fri, 10 Mar 2017 07:38:19 +0000
Subject: Re: Diff in Materialization views registration between Calcite 1.10 and calcite 1.12
In-Reply-To: <02FBBE0D-2399-4778-A6F8-E0DAFFF66FC8@apache.org>

There are two different objects associated with the view: 1) the view itself (TableScan on a materialized view) and 2) the view content (RelNode plan representing the query of the view).

1) I understand that the first object (TS on view) should be registered, as it might be part of the optimization process and the final query plan. However, I think that is already the case, since the TS and any operators placed on top of it to unify the expressions are created in the context of the user query cluster.

2) However, is it necessary to register the query associated with the materialized view using _registerImpl_? The view query has a slightly different nature: it is not part of the user query and it will not be part of the final plan. Registering it adds more nodes to the planning phase, and rules will be triggered on those nodes too, if I am not mistaken. That would unnecessarily increase optimization time and complexity when there are many views or many nodes per view.

Unless I am missing something, I think we should avoid calling _registerImpl_ for 2).

--
Jesús

On 3/9/17, 9:17 PM, "Julian Hyde" wrote:

>So, the question is whether materialized views need to be “registered” with the planner before they can be considered. If they are “registered” with a Volcano planner this means that they are included in equivalence classes (RelSets and RelSubsets) and canonized.
>
>A weaker form of registration is to make sure that the types used (in both the row-type of a RelNode and in the various RexNodes contained therein) all come from the same type factory.
>
>Clearly there are advantages to registration (if objects are canonized they use less memory and can be compared using ==) and there are negatives (significant copying is involved).
>So, the question is whether we can use some kind of compromise: work on un-registered RelNodes at an early stage (while figuring out which materialized views might pertain to a query) and register only when we have narrowed down the set of materialized views.
>
>Maryann,
>
>Since you did https://issues.apache.org/jira/browse/CALCITE-1500 , can you comment on this change?
>
>Julian
>
>
>> On Mar 9, 2017, at 1:09 PM, Remus Rusanu wrote:
>>
>> Moving to calcite-dev
>>
>> From: Remus Rusanu
>> Date: Thursday, March 9, 2017 at 1:04 PM
>> To: Ashutosh Chauhan , Julian Hyde
>> Cc: "sqlopt@hortonworks.com"
>> Subject: Why Calcite 1.10 is not hitting the assert
>>
>> The 1.12 relevant assert stack is this:
>>     at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1475)
>>     at org.apache.calcite.plan.volcano.VolcanoPlanner.registerMaterializations(VolcanoPlanner.java:368)
>>     at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:592)
>>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1467)
>>
>> In 1.10 the names are a bit different, but VolcanoPlanner.findBestExp() calls useApplicableMaterializations(), which exits immediately because context.unwrap(CalciteConnectionConfig.class) returns null. So no ‘registration’ occurs (registerImpl is never called with the provided materialization plan, as per my debugging).
>>
>> However, when needed, the materialization is found.
>> The stack below finds it and uses it, despite it not being ‘registered’:
>>     at org.apache.calcite.plan.volcano.VolcanoPlanner.getMaterializations(VolcanoPlanner.java:348)
>>     at org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewFilterScanRule.apply(HiveMaterializedViewFilterScanRule.java:71)
>>     at org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewFilterScanRule.onMatch(HiveMaterializedViewFilterScanRule.java:64)
>>     at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:213)
>>     at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:819)
>>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1455)
>>
>> The result is the desired one:
>>
>> hive> create materialized view srcm enable rewrite as select key from src where key=10;
>> …
>> hive> explain extended select key from src where key=10;
>> OK
>> STAGE DEPENDENCIES:
>>   Stage-0 is a root stage
>>
>> STAGE PLANS:
>>   Stage: Stage-0
>>     Fetch Operator
>>       limit: -1
>>       Processor Tree:
>>         TableScan
>>           alias: default.srcm
>>           GatherStats: false
>>           Select Operator
>>             expressions: key (type: string)
>>             outputColumnNames: _col0
>>             ListSink
>>
>> The big changes came with CALCITE-1500
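
For illustration only, the compromise discussed in this thread (work on un-registered view plans first, register only the views that might apply) could be sketched roughly as below. This is not Calcite code: Node, View, Registry and applicable() are hypothetical stand-ins for RelNode, a materialization, the planner's canonizing registration and the applicability check, and the "shares tables with the query" filter is an assumed placeholder for whatever test the planner would really use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; none of these types are Calcite classes. */
public class LazyViewRegistration {

    /** Toy plan node identified by a digest string (stand-in for a RelNode). */
    static final class Node {
        final String digest;
        Node(String digest) { this.digest = digest; }
    }

    /** Toy materialized view: its name, its query plan and the tables the query reads. */
    static final class View {
        final String name;
        final Node queryPlan;
        final Set<String> tablesUsed;
        View(String name, Node queryPlan, String... tablesUsed) {
            this.name = name;
            this.queryPlan = queryPlan;
            this.tablesUsed = new HashSet<>(Arrays.asList(tablesUsed));
        }
    }

    /** Toy registry: canonizes nodes by digest, so registered nodes can be compared with ==. */
    static final class Registry {
        private final Map<String, Node> canonical = new HashMap<>();

        /** Returns the canonical instance for this digest, registering the node if it is new. */
        Node register(Node node) {
            Node existing = canonical.putIfAbsent(node.digest, node);
            return existing != null ? existing : node;
        }

        int size() { return canonical.size(); }
    }

    /** Cheap pre-filter on un-registered views: keep those whose tables all appear in the query. */
    static List<View> applicable(List<View> views, Set<String> queryTables) {
        List<View> result = new ArrayList<>();
        for (View v : views) {
            if (queryTables.containsAll(v.tablesUsed)) {
                result.add(v);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<View> views = Arrays.asList(
            new View("srcm", new Node("Filter(key=10, Scan(src))"), "src"),
            new View("otherm", new Node("Scan(other)"), "other"));
        Set<String> queryTables = new HashSet<>(Arrays.asList("src"));

        // Register only the views that survive the cheap filter.
        Registry registry = new Registry();
        for (View v : applicable(views, queryTables)) {
            registry.register(v.queryPlan);
        }

        // A structurally equal node canonizes to the already registered instance.
        Node duplicate = new Node("Filter(key=10, Scan(src))");
        System.out.println(registry.register(duplicate) == views.get(0).queryPlan); // true
        System.out.println(registry.size()); // 1
    }
}
```

In this toy version the plan of "otherm" never reaches the registry, so no rules would fire on it; that is the saving argued for above, at the cost of doing the applicability check on plans that have not been canonized.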