calcite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Remus Rusanu <>
Subject Re: Cluster mismatch between RelNodes of a query and a materialized view
Date Wed, 08 Mar 2017 18:50:31 GMT
To make sure I understand, you’re saying that the cloning would take a tree of various Xxx
operators (eg. HiveProject) and generate a tree of equivalent LogicalXxx  operators (HiveProject
-> LogicalProject, HiveFilter->LogicalFilter, HiveAggregate ->  LogicalAggregate
and so on). Is my understanding correct? Ignoring Hive issues, wouldn’t that loose information?

For Hive specific case, the HiveXxx operator extend Xxx not LogicalXxx (not sure yet if this
is an issue or not) but for sure they carry lots of extra info that would be lost if the clone
would contain LogicalXxx instead of HiveXxx.


On 3/8/17, 9:45 AM, "Julian Hyde" <> wrote:

    Could you work on LogicalXxx rather than HiveXxx? I know the Hive team likes to do everything
in terms of HiveXxx RelNodes but I’m not sure it has to be that way.
    > On Mar 8, 2017, at 9:37 AM, Remus Rusanu <> wrote:
    > Agree on the RelOptCluster.
    > I noticed how adding new method to RelNode yields a gargantuan task (from the list
of compile errors I got…). But I’m not sure a RelShuttle can handle this. For my test
case the 3 nodes that need to be cloned are HiveScanTable, HiveFilter and HiveProject, all
declared in Hive and not even extending AbstractRelNode but directly base RelNode. A RelShuttle
wouldn’t know about these types, and wouldn’t be able to create them (short of using reflection
and adhering to a strict constructor signature, which I think is much too fragile). What I
did to get the ball rolling I added a default implementation in AbstractRelNode (basically
assert ‘must be implemented by subclass’), this allowed me to test easily and, as a proof
of concept, I have it working. But I reckon is fragile test wise, and new RelNode types wouldn’t
know about the requirement to provide a copyTo implementation.
    > Thanks,
    > ~Remus 
    > On 3/8/17, 9:05 AM, "Julian Hyde" <> wrote:
    >    The argument should be a RelOptCluster, not a RelOptPlanner. The link from RelNode
to planner is indirect currently (via cluster) and will be non-existent after CALCITE-1536.
    >    I question whether we need a new method. Putting an abstract method on RelNode
is a huge burden because every RelNode sub-class needs to be fixed when people upgrade. Even
a non-abstract method imposes a conceptual burden: more methods to understand. 
    >    So, my approach would be to sub-class RelShuttle. It’s sufficient that it only
works for LogicalXxx nodes.
    >    No need to copy RexNode expressions. They are immutable.
    >    Julian
    >> On Mar 8, 2017, at 4:14 AM, Remus Rusanu <> wrote:
    >> I created CALCITE-1681 and
I intent to work on it for finishing HIVE-15708.
    >> My current thinking is to create a RelCopier based on RelShuttle and add a new
abstract RelNode.copyTo(RelOptPlanner) that each concrete Rel type must override. The Rex
part is already handled by the existing RexCopier.
    >> Thanks,
    >> ~Remus
    >> On 3/6/17, 12:30 PM, "Julian Hyde" <> wrote:
    >>   Every RelNode belongs to a RelOptCluster, and basically there is one RelOptCluster
created each time a query is prepared. When working with materialized views, the view’s
query is represented as a tree of RelNodes, that tree is used for optimizing more than one
query. When planning a particular query, the nodes of that query will have a different RelOptCluster
than the nodes of the materialized view(s) they are matched against. 
    >>   How do we deal with this? Do we copy the nodes into the query’s cluster once
we have found a match? If so, how? I couldn’t find a sub-class of RelVisitor or RelShuttle
that copies trees to a different RelOptCluster.
    >>   By the way, <>
aims to improve the RelNode life-cycle but I don’t think it will solve this problem.
    >>   Julian

View raw message