calcite-dev mailing list archives

From Jesus Camacho Rodriguez <jcama...@apache.org>
Subject Re: Diff in Materialization views registration between Calcite 1.10 and calcite 1.12
Date Fri, 10 Mar 2017 07:38:19 GMT
There are two different objects associated with the view: 1) the view itself (TableScan on
a materialized view) and 2) the view content (RelNode plan representing the query of the view).

1) I understand that the first object (TS on view) should be registered, as it might be part
of the optimization process and the final query plan. However, I think that is already the
case, as the TS and any other operators placed on top of it to unify the expressions are
created in the context of the user query cluster.

2) However, is it necessary to register the query associated with the materialized view using
_registerImpl_? The view query has a slightly different nature: it is not part of the user
query and it will not be part of the final plan. Registering it would add more nodes to the
planning phase and, if I am not mistaken, rules would be triggered on those nodes too. That
would unnecessarily increase optimization time and complexity when there are many views or
many nodes per view.

Unless I am missing something, I think we should avoid calling _registerImpl_ for 2).
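
To make the cost argument in 2) concrete, here is a toy sketch. It does not use Calcite classes: `ToyPlanner`, `register`, and `optimize` are hypothetical names, and the assumption that every rule is attempted once against every registered node is a deliberate simplification of how a Volcano-style planner behaves.

```java
import java.util.ArrayList;
import java.util.List;

public class RegistrationCostSketch {
    /** Hypothetical stand-in for a planner's working set; not a Calcite class. */
    static final class ToyPlanner {
        private final List<String> registered = new ArrayList<>();

        /** Adds nodes to the working set; each one is later offered to every rule. */
        void register(List<String> nodes) {
            registered.addAll(nodes);
        }

        /** Simplified cost model: one match attempt per (rule, registered node) pair. */
        int optimize(int ruleCount) {
            return ruleCount * registered.size();
        }
    }

    /** Counts rule attempts with or without registering each view's defining query. */
    static int attempts(boolean registerViewQueries, int viewCount, int ruleCount) {
        ToyPlanner planner = new ToyPlanner();
        // User query: Project <- Filter <- TableScan(src).
        planner.register(List.of("Project", "Filter", "TableScan(src)"));
        for (int i = 0; i < viewCount; i++) {
            // 1) The scan on the view may appear in the final plan, so it is registered.
            planner.register(List.of("TableScan(mv" + i + ")"));
            if (registerViewQueries) {
                // 2) The view's defining query never appears in the final plan.
                planner.register(List.of(
                    "Project(mv" + i + ")", "Filter(mv" + i + ")", "TableScan(src, mv" + i + ")"));
            }
        }
        return planner.optimize(ruleCount);
    }

    public static void main(String[] args) {
        System.out.println("scan only:        " + attempts(false, 100, 10));
        System.out.println("scan + view query: " + attempts(true, 100, 10));
    }
}
```

Under these assumptions, 100 views and 10 rules give 1030 attempts when only the scans are registered and 4030 when the view queries are registered as well. The absolute numbers mean nothing; the point is only that the extra work grows with the number of views and the size of their queries.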


--
Jesús


On 3/9/17, 9:17 PM, "Julian Hyde" <jhyde@apache.org> wrote:

>So, the question is whether materialized views need to be “registered” with the planner
>before they can be considered. If they are “registered” with a Volcano planner this means
>that they are included in equivalence classes (RelSets and RelSubsets) and canonized.
>
>A weaker form of registration is to make sure that the types used (in both the row-type
>of a RelNode and in the various RexNodes contained therein) all come from the same type factory.
>
>Clearly there are advantages to registration (if objects are canonized they use less memory
>and can be compared using ==) and there are negatives (significant copying is involved). So,
>the question is whether we can use some kind of compromise: work on un-registered RelNodes
>at an early stage (while figuring out which materialized views might pertain to a query) and
>register only when we have narrowed down the set of materialized views.
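
As an illustration of the canonization point in the quoted paragraph above (not part of the original message), here is a minimal sketch of why canonized objects can be compared with `==`. The `Node` and `Registry` classes and the use of a digest string as the lookup key are assumptions made for this toy example; they are not Calcite's implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class CanonizationSketch {
    /** Hypothetical plan node identified by a textual digest; not a Calcite class. */
    static final class Node {
        final String digest;

        Node(String digest) {
            this.digest = digest;
        }
    }

    /** Hypothetical registry that keeps exactly one canonical Node per digest. */
    static final class Registry {
        private final Map<String, Node> byDigest = new HashMap<>();

        /** Returns the canonical instance for this digest, storing n if it is the first. */
        Node register(Node n) {
            Node existing = byDigest.putIfAbsent(n.digest, n);
            return existing != null ? existing : n;
        }

        int size() {
            return byDigest.size();
        }
    }

    public static void main(String[] args) {
        Node a = new Node("Filter(key=10, TableScan(src))");
        Node b = new Node("Filter(key=10, TableScan(src))");
        Registry registry = new Registry();

        // Before registration the two equivalent nodes are distinct objects.
        System.out.println(a == b);
        // After registration both resolve to the same canonical instance.
        System.out.println(registry.register(a) == registry.register(b));
    }
}
```

The registry holds a single object for the two equivalent nodes, which is the memory saving mentioned above; the cost is the lookup (and, in a real planner, the copying) needed to register every node.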
>
>Maryann,
>
>Since you did https://issues.apache.org/jira/browse/CALCITE-1500,
>can you comment on this change?
>
>Julian
>
>
>> On Mar 9, 2017, at 1:09 PM, Remus Rusanu <rrusanu@hortonworks.com> wrote:
>> 
>> Moving to calcite-dev
>> 
>> From: Remus Rusanu <rrusanu@hortonworks.com>
>> Date: Thursday, March 9, 2017 at 1:04 PM
>> To: Ashutosh Chauhan <ashutosh@hortonworks.com>, Julian Hyde <jhyde@hortonworks.com>
>> Cc: "sqlopt@hortonworks.com" <sqlopt@hortonworks.com>
>> Subject: Why Calcite 1.10 is not hitting the assert
>> 
>> The 1.12 relevant assert stack is this:
>>       at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1475)
>>       at org.apache.calcite.plan.volcano.VolcanoPlanner.registerMaterializations(VolcanoPlanner.java:368)
>>       at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:592)
>>       at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1467)
>> 
>> In 1.10 the names are a bit different, but VolcanoPlanner.findBestExp() calls useApplicableMaterializations(),
>> which exits immediately because context.unwrap(CalciteConnectionConfig.class) returns null.
>> So no ‘registration’ occurs (registerImpl is never called with the provided materialization
>> plan, as per my debugging).
>> 
>> However, when needed, the materialization is found. The stack below finds it and
>> uses it, despite it not being ‘registered’:
>>       at org.apache.calcite.plan.volcano.VolcanoPlanner.getMaterializations(VolcanoPlanner.java:348)
>>       at org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewFilterScanRule.apply(HiveMaterializedViewFilterScanRule.java:71)
>>       at org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewFilterScanRule.onMatch(HiveMaterializedViewFilterScanRule.java:64)
>>       at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:213)
>>       at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:819)
>>       at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1455)
>> 
>> The result is the desired one:
>> 
>> hive> create materialized view srcm enable rewrite as select key from src where key=10;
>> …
>> hive> explain extended select key from src where key=10;
>> OK
>> STAGE DEPENDENCIES:
>>  Stage-0 is a root stage
>> 
>> STAGE PLANS:
>>  Stage: Stage-0
>>    Fetch Operator
>>      limit: -1
>>      Processor Tree:
>>        TableScan
>>          alias: default.srcm
>>          GatherStats: false
>>          Select Operator
>>            expressions: key (type: string)
>>            outputColumnNames: _col0
>>            ListSink
>> 
>> The big changes came with CALCITE-1500
>> 
>

