From: Ashutosh Chauhan
Date: Wed, 8 Mar 2017 11:27:57 -0800
Subject: Re: Cluster mismatch between RelNodes of a query and a materialized view
To: dev@calcite.apache.org

Working on LogicalRelNodes doesn’t solve the problem for Hive (or any
other project that uses RelFactories to generate custom rel nodes). The
fact that RelFactories exist is evidence enough that LogicalRelNodes are
not sufficient outside of Calcite. We have to solve this problem in that
context.

On Wed, Mar 8, 2017 at 10:50 AM, Remus Rusanu wrote:

> To make sure I understand, you’re saying that the cloning would take a
> tree of various Xxx operators (e.g. HiveProject) and generate a tree of
> equivalent LogicalXxx operators (HiveProject -> LogicalProject,
> HiveFilter -> LogicalFilter, HiveAggregate -> LogicalAggregate and so
> on). Is my understanding correct? Ignoring Hive issues, wouldn’t that
> lose information?
>
> For the Hive-specific case, the HiveXxx operators extend Xxx, not
> LogicalXxx (not sure yet if this is an issue or not), but for sure they
> carry lots of extra info that would be lost if the clone contained
> LogicalXxx instead of HiveXxx.
>
> Thanks,
> ~Remus
>
> On 3/8/17, 9:45 AM, "Julian Hyde" wrote:
>
> > Could you work on LogicalXxx rather than HiveXxx? I know the Hive team
> > likes to do everything in terms of HiveXxx RelNodes but I’m not sure
> > it has to be that way.
> >
> > Julian
> >
> > On Mar 8, 2017, at 9:37 AM, Remus Rusanu wrote:
> >
> > > Agree on the RelOptCluster.
> > >
> > > I noticed how adding a new method to RelNode yields a gargantuan
> > > task (from the list of compile errors I got…). But I’m not sure a
> > > RelShuttle can handle this. For my test case the 3 nodes that need
> > > to be cloned are HiveScanTable, HiveFilter and HiveProject, all
> > > declared in Hive and not even extending AbstractRelNode but
> > > directly the base RelNode. A RelShuttle wouldn’t know about these
> > > types, and wouldn’t be able to create them (short of using
> > > reflection and adhering to a strict constructor signature, which I
> > > think is much too fragile). What I did to get the ball rolling was
> > > to add a default implementation in AbstractRelNode (basically
> > > assert ‘must be implemented by subclass’). This allowed me to test
> > > easily and, as a proof of concept, I have it working. But I reckon
> > > it is fragile test-wise, and new RelNode types wouldn’t know about
> > > the requirement to provide a copyTo implementation.
> > >
> > > Thanks,
> > > ~Remus
> > >
> > > On 3/8/17, 9:05 AM, "Julian Hyde" wrote:
> > >
> > > > The argument should be a RelOptCluster, not a RelOptPlanner. The
> > > > link from RelNode to planner is indirect currently (via cluster)
> > > > and will be non-existent after CALCITE-1536.
> > > >
> > > > I question whether we need a new method. Putting an abstract
> > > > method on RelNode is a huge burden because every RelNode
> > > > sub-class needs to be fixed when people upgrade. Even a
> > > > non-abstract method imposes a conceptual burden: more methods to
> > > > understand.
> > > >
> > > > So, my approach would be to sub-class RelShuttle. It’s sufficient
> > > > that it only works for LogicalXxx nodes.
> > > >
> > > > No need to copy RexNode expressions. They are immutable.
> > > >
> > > > Julian
> > > >
> > > > On Mar 8, 2017, at 4:14 AM, Remus Rusanu wrote:
> > > >
> > > > > I created CALCITE-1681
> > > > > https://issues.apache.org/jira/browse/CALCITE-1681
> > > > > and I intend to work on it for finishing HIVE-15708.
> > > > > My current thinking is to create a RelCopier based on
> > > > > RelShuttle and add a new abstract
> > > > > RelNode.copyTo(RelOptPlanner) that each concrete Rel type must
> > > > > override. The Rex part is already handled by the existing
> > > > > RexCopier.
> > > > >
> > > > > Thanks,
> > > > > ~Remus
> > > > >
> > > > > On 3/6/17, 12:30 PM, "Julian Hyde" wrote:
> > > > >
> > > > > > Every RelNode belongs to a RelOptCluster, and basically there
> > > > > > is one RelOptCluster created each time a query is prepared.
> > > > > > When working with materialized views, the view’s query is
> > > > > > represented as a tree of RelNodes, and that tree is used for
> > > > > > optimizing more than one query. When planning a particular
> > > > > > query, the nodes of that query will have a different
> > > > > > RelOptCluster than the nodes of the materialized view(s) they
> > > > > > are matched against.
> > > > > >
> > > > > > How do we deal with this? Do we copy the nodes into the
> > > > > > query’s cluster once we have found a match? If so, how? I
> > > > > > couldn’t find a sub-class of RelVisitor or RelShuttle that
> > > > > > copies trees to a different RelOptCluster.
> > > > > >
> > > > > > By the way,
> > > > > > https://issues.apache.org/jira/browse/CALCITE-1536
> > > > > > aims to improve the RelNode life-cycle but I don’t think it
> > > > > > will solve this problem.
> > > > > >
> > > > > > Julian
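
For readers following the thread, the visitor-based copy that is being
debated can be sketched in self-contained Java. This is only an
illustration under stated assumptions: Cluster, Node, Scan, Filter,
NodeVisitor and ClusterCopier are hypothetical stand-ins invented here,
not Calcite or Hive classes, and the condition is a plain String standing
in for an immutable expression. The sketch shows a tree being rebuilt
bottom-up in a target cluster while the immutable condition is shared
rather than copied.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// All types here are hypothetical stand-ins for illustration; none is a Calcite class.
class ClusterCopyDemo {

  /** Stand-in for a per-query planning context (the role a cluster plays in the thread). */
  static final class Cluster {
    final String name;
    Cluster(String name) { this.name = name; }
  }

  /** Visitor over the node types it was written for; other subclasses cannot be handled. */
  interface NodeVisitor<R> {
    R visit(Scan scan);
    R visit(Filter filter);
  }

  /** Stand-in for a relational node that is permanently tied to one cluster. */
  abstract static class Node {
    final Cluster cluster;
    final List<Node> inputs;
    Node(Cluster cluster, List<Node> inputs) {
      this.cluster = cluster;
      this.inputs = Collections.unmodifiableList(new ArrayList<>(inputs));
    }
    abstract <R> R accept(NodeVisitor<R> visitor);
  }

  static final class Scan extends Node {
    final String table;
    Scan(Cluster cluster, String table) {
      super(cluster, Collections.<Node>emptyList());
      this.table = table;
    }
    @Override <R> R accept(NodeVisitor<R> visitor) { return visitor.visit(this); }
  }

  static final class Filter extends Node {
    final String condition; // immutable expression: shared between copies, never cloned
    Filter(Cluster cluster, Node input, String condition) {
      super(cluster, Collections.singletonList(input));
      this.condition = condition;
    }
    @Override <R> R accept(NodeVisitor<R> visitor) { return visitor.visit(this); }
  }

  /** Rebuilds a tree bottom-up so that every new node belongs to the target cluster. */
  static final class ClusterCopier implements NodeVisitor<Node> {
    private final Cluster target;
    ClusterCopier(Cluster target) { this.target = target; }
    @Override public Node visit(Scan scan) { return new Scan(target, scan.table); }
    @Override public Node visit(Filter filter) {
      Node newInput = filter.inputs.get(0).accept(this);
      return new Filter(target, newInput, filter.condition);
    }
  }

  /** Returns true if the node and all of its descendants belong to the given cluster. */
  static boolean allInCluster(Node node, Cluster cluster) {
    if (node.cluster != cluster) return false;
    for (Node input : node.inputs) {
      if (!allInCluster(input, cluster)) return false;
    }
    return true;
  }

  public static void main(String[] args) {
    Cluster viewCluster = new Cluster("materialized-view");
    Cluster queryCluster = new Cluster("query");
    Node viewPlan = new Filter(viewCluster, new Scan(viewCluster, "emps"), "deptno = 10");
    Node copy = viewPlan.accept(new ClusterCopier(queryCluster));
    System.out.println(allInCluster(copy, queryCluster));    // true
    System.out.println(allInCluster(viewPlan, viewCluster)); // true
  }
}
```

Note that the visitor in this sketch can only rebuild the node types
listed in its interface. That limitation is the same one raised above for
project-specific node types: a copier written against a fixed set of
operators has no way to construct operators declared elsewhere.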