drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jacques Nadeau <jacq...@dremio.com>
Subject Re: [DISCUSS] Get off Calcite Forked Version
Date Tue, 17 Nov 2015 17:54:37 GMT
I'm fine with any approach. My suggestion came from my sense that, so far,
our burn down hasn't been super successful. And sometimes I feel like
Jinfeng gets stuck doing all the heavy lifting.

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Tue, Nov 17, 2015 at 9:48 AM, Julian Hyde <jhyde.apache@gmail.com> wrote:

> I can do a hackathon if it helps. You're approaching it as a "big bang"
> but we could approach it as a burn down, counting the number of commits
> different between drill's fork and master.  There's a lot of incremental
> value each commit we remove.
>
> Julian
>
> > On Nov 17, 2015, at 08:58, Jacques Nadeau <jacques@dremio.com> wrote:
> >
> > Hey Jinfeng,
> >
> > Thanks for pulling this together. It is definitely time to get off the
> > fork. I see a number of items with PCM. Should we do a hackathon to do
> all
> > the PCMs one afternoon? Maybe we could convince Julian to join so we
> could
> > make short progress of the issues. The original commits that have PCM
> were
> > done by the following developers: Mehant, Jinfeng, Jacques, Venki & Aman.
> > Even if we just get Aman, Jinfeng and myself, we could probably close out
> > most of the gaps. (Mehant only has one and it seems pretty small. Venki's
> > is larger but is only one as well.)
> >
> > It seems like an achievable goal that we could get could get on top of
> > 1.5.x master + star + 2-3 other patches for the 1.4 release. What are
> your
> > thoughts?
> >
> >
> >
> > --
> > Jacques Nadeau
> > CTO and Co-Founder, Dremio
> >
> >> On Tue, Nov 10, 2015 at 2:55 PM, Jinfeng Ni <jinfengni99@gmail.com>
> wrote:
> >>
> >> Julian,
> >>
> >> Thanks a lot for your detailed comments!
> >>
> >> I totally agree with you on "calcite first" policy. That's why on MapR
> >> side, I recommend people get changes in Calcite master first, then
> >> cherry-pick back. Several recent patches follow this patten, as we
> >> realized the pain to maintain this calcite fork.
> >>
> >> I also agree that we should get them in ASAP and it is a good idea to
> >> start with simple patch firsts.
> >>
> >>
> >> 1. Why do you need "Add new method to ViewExpander interface to allow
> >> passing SchemaRoot."? Most ViewExpander implementations have a root
> >> schema. Note that the return type of that method has changed from
> >> RelNode to RelRoot.
> >>
> >> [
> >> Venki might have better idea. As far as I remember, the patch was
> >> merged to handle case where the view is created in a different schema
> >> from the source schema. Drill uses Frameworks, not the regular code
> >> path that Calcite master uses. Not sure if that's the cause of the
> >> problem on Drill side.
> >> ]
> >>
> >> 2. "Match Logical Rel in ReduceExpressionRule." should be easy now we've
> >> parameterized the rules, and maybe you can get rid of "Enable
> >> inheriting from previously singleton filter and calc constant
> >> reduction rules.".
> >>
> >> [ Agree.]
> >>
> >> 3. "Disable the nullability cast checking in Filter." is worth
> discussing
> >> on the Calcite list.
> >>
> >> [ Will do that once I revise the patch with testcase.]
> >>
> >> 4. In "Return validated row type and validated SqlNode when call
> >> Planner.validate()." you should return a Pair<SqlNode, RelDataType>.
> >>
> >> [ The patch essentially returns Pair<SqlNode, RelDataType>, since it
> >> put SqlNode and RelDataType in a wrapper. With Calcite master, seems
> >> RelRoot probably is a better choice; it includes everything that we
> >> want.
> >> ]
> >>
> >>
> >> 5. To do "Use case-insensitive comparison when creating output row type
> >> of a JoinRel." properly we require a new config parameter saying
> >> whether internal field names are case-sensitive.
> >>
> >> Since "Don't use 'SumEmptyIsZero' (SUM0) window aggregate until
> >> CALCITE-777 is fixed." requires a fix to CALCITE-777 please contribute
> >> a fix for CALCITE-777! :)
> >>
> >> [Aman will have better idea]
> >>
> >> 6. Why do you need "Make public the constructor of
> >> SqlSumEmptyIsZeroAggFunction,"?
> >>
> >> [ It's used in Dril's ReduceAggregateRule. We should revisit this rule
> >> to see whether it's still required to keep this rule in stead of using
> >> Calcite's ReduceAggregateRule directly.
> >> ]
> >>
> >> 7. You don't need "Make certain push project constructors protected such
> >> that derived classes can access them." now we've parameterized
> >> ProjectFilterTransposeRule.
> >>
> >> [Agree.]
> >>
> >> 8. "DRILL-1455: Add return type-inference strategy for arithmetic
> >> operators when one of the arguments is ANY type." needs some
> >> discussion. What if both are ANY? How are other implementations of the
> >> operators (e.g. Calcite, Phoenix) going to implement the operator now
> >> that Drill's lax validation policy has allowed it in?
> >>
> >> [  Will discuss in Calcite list once we have the PR ready for this
> patch.
> >> ]
> >>
> >> 9. I can't see yet how the [StarColumn] changes could make it in. Will
> >> require some discussion. Note that CALCITE-546 has broken some of your
> >> assumptions.
> >>
> >> [ I know it would be big headache to resolve the changes related to
> >> star column for schema-on-read table.  I have to re-visit those
> >> changes, and probably start a discussion on Calcite list once I do a
> >> rebase and code cleanup
> >> ]
> >>
> >> Again, thanks for your feedback. I feel it's the right time for the
> >> Drill community to try to get off calcite fork!
> >>
> >> Regards,
> >>
> >> Jinfeng
> >>
> >>
> >>> On Tue, Nov 10, 2015 at 1:51 PM, Julian Hyde <jhyde@apache.org> wrote:
> >>> Going forward, I think you should start a "calcite first" policy --
> >>> get changes accepted into Calcite, with tests of course, then
> >>> cherry-pick into Drill's branch. Recent changes like "Allow a MAP
> >>> literal type." should have done that. And especially changes where you
> >>> could just parameterize planner rule. In other words, stop digging the
> >>> hole deeper.
> >>>
> >>> I think you should identify and do the easy ones ASAP.
> >>>
> >>> Rather than creating a branch for each of these contributions, if you
> >>> prefer you could create a pull request of the whole branch I think
> >>> that individual JIRAs can just reference a commit id that a Calcite
> >>> committer can cherry-pick.
> >>>
> >>> Now some remarks on specific changes.
> >>>
> >>> Why do you need "Add new method to ViewExpander interface to allow
> >>> passing SchemaRoot."? Most ViewExpander implementations have a root
> >>> schema. Note that the return type of that method has changed from
> >>> RelNode to RelRoot.
> >>>
> >>> "Match Logical Rel in ReduceExpressionRule." should be easy now we've
> >>> parameterized the rules, and maybe you can get rid of "Enable
> >>> inheriting from previously singleton filter and calc constant
> >>> reduction rules.".
> >>>
> >>> "Disable the nullability cast checking in Filter." is worth discussing
> >>> on the Calcite list.
> >>>
> >>> In "Return validated row type and validated SqlNode when call
> >>> Planner.validate()." you should return a Pair<SqlNode, RelDataType>.
> >>>
> >>> To do "Use case-insensitive comparison when creating output row type
> >>> of a JoinRel." properly we require a new config parameter saying
> >>> whether internal field names are case-sensitive.
> >>>
> >>> Since "Don't use 'SumEmptyIsZero' (SUM0) window aggregate until
> >>> CALCITE-777 is fixed." requires a fix to CALCITE-777 please contribute
> >>> a fix for CALCITE-777! :)
> >>>
> >>> Why do you need "Make public the constructor of
> >> SqlSumEmptyIsZeroAggFunction,"?
> >>>
> >>> You don't need "Make certain push project constructors protected such
> >>> that derived classes can access them." now we've parameterized
> >>> ProjectFilterTransposeRule.
> >>>
> >>> "DRILL-1455: Add return type-inference strategy for arithmetic
> >>> operators when one of the arguments is ANY type." needs some
> >>> discussion. What if both are ANY? How are other implementations of the
> >>> operators (e.g. Calcite, Phoenix) going to implement the operator now
> >>> that Drill's lax validation policy has allowed it in?
> >>>
> >>> I can't see yet how the [StarColumn] changes could make it in. Will
> >>> require some discussion. Note that CALCITE-546 has broken some of your
> >>> assumptions.
> >>>
> >>> Julian
> >>>
> >>>
> >>>
> >>>> On Tue, Nov 10, 2015 at 12:27 PM, Jinfeng Ni <jni@apache.org>
wrote:
> >>>> Currently, Drill maintains a forked version of Calcite. Because of
> >>>> this, it has been a big headache to rebase onto the latest Calcite
> >>>> master branch every time when Calcite makes a new release.
> >>>>
> >>>> During today's Drill hangout, we discuss ideas around how to get Drill
> >>>> off the forked Calcite, by either pushing back patches in the forked
> >>>> version to Calcite master branch, or trying to see if those patches
> >>>> are still required for Drill to work properly.
> >>>>
> >>>> There are 49 commits in forked Calcite whose base is Calcite 1.4.0
> >>>> release [1]. I put a spreadsheet for the current status of those 49
> >>>> commits, as far as I know [2]. Half of them are the patches that were
> >>>> cherry-picked from Calcite master branch,  9 are just for version
> >>>> change, and the rest requires either push back or re-work in Drill
> >>>> side.
> >>>>
> >>>> I think we probably have to spend considerable effort to get this done
> >>>> ASAP. Otherwise, it will continue to be an issue for Drill.
> >>>>
> >>>> There are two separate work to be done:
> >>>> 1. Continue to rebase forked version onto latest Calcite master
> >>>> (Currently it's 1.5.0)
> >>>> 2. Push the patches in fork to Calcite master. Once all patches are
> >>>> in, we could finally get rid of fork and rebasing efforts.
> >>>>
> >>>> The spreadsheet has a column for sign-up.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> (let me know if you have problem accessing the spreadsheet).
> >>>>
> >>>>
> >>>> [1] https://github.com/dremio/calcite/tree/DrillCalcite1.4.0
> >>>> [2]
> >>
> https://docs.google.com/spreadsheets/d/1Z0J5aMCBfqq_9SEl-LXsi_B8Ahaosygqke4cH6pPwmo/edit#gid=0
> >>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message