Return-Path: X-Original-To: apmail-drill-dev-archive@www.apache.org Delivered-To: apmail-drill-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id EEB7F18B33 for ; Tue, 17 Nov 2015 17:54:48 +0000 (UTC) Received: (qmail 52325 invoked by uid 500); 17 Nov 2015 17:54:48 -0000 Delivered-To: apmail-drill-dev-archive@drill.apache.org Received: (qmail 52268 invoked by uid 500); 17 Nov 2015 17:54:48 -0000 Mailing-List: contact dev-help@drill.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@drill.apache.org Delivered-To: mailing list dev@drill.apache.org Received: (qmail 52255 invoked by uid 99); 17 Nov 2015 17:54:48 -0000 Received: from Unknown (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 17 Nov 2015 17:54:48 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id F3C76C2EAA for ; Tue, 17 Nov 2015 17:54:47 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 3.001 X-Spam-Level: *** X-Spam-Status: No, score=3.001 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=3, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=dremio_com.20150623.gappssmtp.com Received: from mx1-us-east.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id NIPAxLiJEBPb for ; Tue, 17 Nov 2015 17:54:38 +0000 (UTC) Received: from mail-wm0-f46.google.com (mail-wm0-f46.google.com [74.125.82.46]) by mx1-us-east.apache.org (ASF Mail Server at mx1-us-east.apache.org) with ESMTPS id 7542E42B60 for ; Tue, 17 Nov 2015 17:54:38 +0000 (UTC) Received: by wmww144 with SMTP id w144so37307500wmw.0 for ; Tue, 17 Nov 2015 09:54:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dremio_com.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=DUAWRPXZe9+xEZMOuiQ3Joh8YFDE/vF35SaiPFAy4wk=; b=U82sb/yiF+COt4mav8CSr3IV2GrUUJfTd0y0qm8HiyxKnattp2B/rpiX5LlNmI5YB5 saIDnqoKXsDsU34hSB7/de9BDHZhU3MWqNaMjoNr4tiLy6DzIO1psN2VrPCtbMdp5vjH DX0ob8Xlx+amqlefId75twRpq4ivwNfSrlw2eJYTmOffnMCgV11ifS94D6KqCmGVCvQk x8tisq6HO196EbDflNdAStrcKdy2+OqYP1mc8Bc0/j89N3K9ZOtjaprPgO+oKH3gdF9N AYJioONEQS21+HexrQBPMA7BZp7NXeJZqZbQBCfwN9pz5G9sgrKR5AKjXOmq+42EjCyy 2TWw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:content-type; bh=DUAWRPXZe9+xEZMOuiQ3Joh8YFDE/vF35SaiPFAy4wk=; b=jfk+GhVBjt+aawjx+YKwIsg+nPXZVWT/9w7sJj7CoCfJOuu+J6+xiMhMqJTl6C0Vor /+UmXCejYTfeTZDQPcOgTDg7cvTNGbNtaQQ+XEGJLZ3WBzE0+KgHRKMH6OGxipV0w6OY kg9BnaAsiL0N76Y8/J4TYOgqSZPxHM6fnaysvI4k7DLxXU1yp/nsoHmoXrzh+MPlHcoI lw+5p3ohmqWu97ZzYqsSN+ao3fARW1qyN1U3/2WVVZ3G7uj0r6Cf2Xn7ab0Yx/pWpcPv io3h/W+sWnKECKk0occxxKQt2Kyt0ITHRXLiKy/v0GI996Z2IFLQnjHv3PXZs8nJlaY3 Zj5A== X-Gm-Message-State: ALoCoQn2kHjRDlWmCrbc+/rTbj/5reqQ4GJoBF6+m61xPTNh+kxRMG2PyryO0rjZJ/Eh1TPgUwlm MIME-Version: 1.0 X-Received: by 10.194.84.4 with SMTP id u4mr44905886wjy.149.1447782877550; Tue, 17 Nov 2015 09:54:37 -0800 (PST) Received: by 10.27.205.215 with HTTP; Tue, 17 Nov 2015 09:54:37 -0800 (PST) In-Reply-To: References: Date: Tue, 17 Nov 2015 09:54:37 -0800 Message-ID: Subject: Re: [DISCUSS] Get off Calcite Forked Version From: Jacques Nadeau To: dev Content-Type: multipart/alternative; boundary=047d7bb04d3620f4a70524c038c4 --047d7bb04d3620f4a70524c038c4 Content-Type: text/plain; charset=UTF-8 I'm fine with any approach. My suggestion came from my sense that, so far, our burn down hasn't been super successful. And sometimes I feel like Jinfeng gets stuck doing all the heavy lifting. -- Jacques Nadeau CTO and Co-Founder, Dremio On Tue, Nov 17, 2015 at 9:48 AM, Julian Hyde wrote: > I can do a hackathon if it helps. You're approaching it as a "big bang" > but we could approach it as a burn down, counting the number of commits > different between drill's fork and master. There's a lot of incremental > value each commit we remove. > > Julian > > > On Nov 17, 2015, at 08:58, Jacques Nadeau wrote: > > > > Hey Jinfeng, > > > > Thanks for pulling this together. It is definitely time to get off the > > fork. I see a number of items with PCM. Should we do a hackathon to do > all > > the PCMs one afternoon? Maybe we could convince Julian to join so we > could > > make short progress of the issues. The original commits that have PCM > were > > done by the following developers: Mehant, Jinfeng, Jacques, Venki & Aman. > > Even if we just get Aman, Jinfeng and myself, we could probably close out > > most of the gaps. (Mehant only has one and it seems pretty small. Venki's > > is larger but is only one as well.) > > > > It seems like an achievable goal that we could get could get on top of > > 1.5.x master + star + 2-3 other patches for the 1.4 release. What are > your > > thoughts? > > > > > > > > -- > > Jacques Nadeau > > CTO and Co-Founder, Dremio > > > >> On Tue, Nov 10, 2015 at 2:55 PM, Jinfeng Ni > wrote: > >> > >> Julian, > >> > >> Thanks a lot for your detailed comments! > >> > >> I totally agree with you on "calcite first" policy. That's why on MapR > >> side, I recommend people get changes in Calcite master first, then > >> cherry-pick back. Several recent patches follow this patten, as we > >> realized the pain to maintain this calcite fork. > >> > >> I also agree that we should get them in ASAP and it is a good idea to > >> start with simple patch firsts. > >> > >> > >> 1. Why do you need "Add new method to ViewExpander interface to allow > >> passing SchemaRoot."? Most ViewExpander implementations have a root > >> schema. Note that the return type of that method has changed from > >> RelNode to RelRoot. > >> > >> [ > >> Venki might have better idea. As far as I remember, the patch was > >> merged to handle case where the view is created in a different schema > >> from the source schema. Drill uses Frameworks, not the regular code > >> path that Calcite master uses. Not sure if that's the cause of the > >> problem on Drill side. > >> ] > >> > >> 2. "Match Logical Rel in ReduceExpressionRule." should be easy now we've > >> parameterized the rules, and maybe you can get rid of "Enable > >> inheriting from previously singleton filter and calc constant > >> reduction rules.". > >> > >> [ Agree.] > >> > >> 3. "Disable the nullability cast checking in Filter." is worth > discussing > >> on the Calcite list. > >> > >> [ Will do that once I revise the patch with testcase.] > >> > >> 4. In "Return validated row type and validated SqlNode when call > >> Planner.validate()." you should return a Pair. > >> > >> [ The patch essentially returns Pair, since it > >> put SqlNode and RelDataType in a wrapper. With Calcite master, seems > >> RelRoot probably is a better choice; it includes everything that we > >> want. > >> ] > >> > >> > >> 5. To do "Use case-insensitive comparison when creating output row type > >> of a JoinRel." properly we require a new config parameter saying > >> whether internal field names are case-sensitive. > >> > >> Since "Don't use 'SumEmptyIsZero' (SUM0) window aggregate until > >> CALCITE-777 is fixed." requires a fix to CALCITE-777 please contribute > >> a fix for CALCITE-777! :) > >> > >> [Aman will have better idea] > >> > >> 6. Why do you need "Make public the constructor of > >> SqlSumEmptyIsZeroAggFunction,"? > >> > >> [ It's used in Dril's ReduceAggregateRule. We should revisit this rule > >> to see whether it's still required to keep this rule in stead of using > >> Calcite's ReduceAggregateRule directly. > >> ] > >> > >> 7. You don't need "Make certain push project constructors protected such > >> that derived classes can access them." now we've parameterized > >> ProjectFilterTransposeRule. > >> > >> [Agree.] > >> > >> 8. "DRILL-1455: Add return type-inference strategy for arithmetic > >> operators when one of the arguments is ANY type." needs some > >> discussion. What if both are ANY? How are other implementations of the > >> operators (e.g. Calcite, Phoenix) going to implement the operator now > >> that Drill's lax validation policy has allowed it in? > >> > >> [ Will discuss in Calcite list once we have the PR ready for this > patch. > >> ] > >> > >> 9. I can't see yet how the [StarColumn] changes could make it in. Will > >> require some discussion. Note that CALCITE-546 has broken some of your > >> assumptions. > >> > >> [ I know it would be big headache to resolve the changes related to > >> star column for schema-on-read table. I have to re-visit those > >> changes, and probably start a discussion on Calcite list once I do a > >> rebase and code cleanup > >> ] > >> > >> Again, thanks for your feedback. I feel it's the right time for the > >> Drill community to try to get off calcite fork! > >> > >> Regards, > >> > >> Jinfeng > >> > >> > >>> On Tue, Nov 10, 2015 at 1:51 PM, Julian Hyde wrote: > >>> Going forward, I think you should start a "calcite first" policy -- > >>> get changes accepted into Calcite, with tests of course, then > >>> cherry-pick into Drill's branch. Recent changes like "Allow a MAP > >>> literal type." should have done that. And especially changes where you > >>> could just parameterize planner rule. In other words, stop digging the > >>> hole deeper. > >>> > >>> I think you should identify and do the easy ones ASAP. > >>> > >>> Rather than creating a branch for each of these contributions, if you > >>> prefer you could create a pull request of the whole branch I think > >>> that individual JIRAs can just reference a commit id that a Calcite > >>> committer can cherry-pick. > >>> > >>> Now some remarks on specific changes. > >>> > >>> Why do you need "Add new method to ViewExpander interface to allow > >>> passing SchemaRoot."? Most ViewExpander implementations have a root > >>> schema. Note that the return type of that method has changed from > >>> RelNode to RelRoot. > >>> > >>> "Match Logical Rel in ReduceExpressionRule." should be easy now we've > >>> parameterized the rules, and maybe you can get rid of "Enable > >>> inheriting from previously singleton filter and calc constant > >>> reduction rules.". > >>> > >>> "Disable the nullability cast checking in Filter." is worth discussing > >>> on the Calcite list. > >>> > >>> In "Return validated row type and validated SqlNode when call > >>> Planner.validate()." you should return a Pair. > >>> > >>> To do "Use case-insensitive comparison when creating output row type > >>> of a JoinRel." properly we require a new config parameter saying > >>> whether internal field names are case-sensitive. > >>> > >>> Since "Don't use 'SumEmptyIsZero' (SUM0) window aggregate until > >>> CALCITE-777 is fixed." requires a fix to CALCITE-777 please contribute > >>> a fix for CALCITE-777! :) > >>> > >>> Why do you need "Make public the constructor of > >> SqlSumEmptyIsZeroAggFunction,"? > >>> > >>> You don't need "Make certain push project constructors protected such > >>> that derived classes can access them." now we've parameterized > >>> ProjectFilterTransposeRule. > >>> > >>> "DRILL-1455: Add return type-inference strategy for arithmetic > >>> operators when one of the arguments is ANY type." needs some > >>> discussion. What if both are ANY? How are other implementations of the > >>> operators (e.g. Calcite, Phoenix) going to implement the operator now > >>> that Drill's lax validation policy has allowed it in? > >>> > >>> I can't see yet how the [StarColumn] changes could make it in. Will > >>> require some discussion. Note that CALCITE-546 has broken some of your > >>> assumptions. > >>> > >>> Julian > >>> > >>> > >>> > >>>> On Tue, Nov 10, 2015 at 12:27 PM, Jinfeng Ni wrote: > >>>> Currently, Drill maintains a forked version of Calcite. Because of > >>>> this, it has been a big headache to rebase onto the latest Calcite > >>>> master branch every time when Calcite makes a new release. > >>>> > >>>> During today's Drill hangout, we discuss ideas around how to get Drill > >>>> off the forked Calcite, by either pushing back patches in the forked > >>>> version to Calcite master branch, or trying to see if those patches > >>>> are still required for Drill to work properly. > >>>> > >>>> There are 49 commits in forked Calcite whose base is Calcite 1.4.0 > >>>> release [1]. I put a spreadsheet for the current status of those 49 > >>>> commits, as far as I know [2]. Half of them are the patches that were > >>>> cherry-picked from Calcite master branch, 9 are just for version > >>>> change, and the rest requires either push back or re-work in Drill > >>>> side. > >>>> > >>>> I think we probably have to spend considerable effort to get this done > >>>> ASAP. Otherwise, it will continue to be an issue for Drill. > >>>> > >>>> There are two separate work to be done: > >>>> 1. Continue to rebase forked version onto latest Calcite master > >>>> (Currently it's 1.5.0) > >>>> 2. Push the patches in fork to Calcite master. Once all patches are > >>>> in, we could finally get rid of fork and rebasing efforts. > >>>> > >>>> The spreadsheet has a column for sign-up. > >>>> > >>>> Thoughts? > >>>> > >>>> (let me know if you have problem accessing the spreadsheet). > >>>> > >>>> > >>>> [1] https://github.com/dremio/calcite/tree/DrillCalcite1.4.0 > >>>> [2] > >> > https://docs.google.com/spreadsheets/d/1Z0J5aMCBfqq_9SEl-LXsi_B8Ahaosygqke4cH6pPwmo/edit#gid=0 > >> > --047d7bb04d3620f4a70524c038c4--