hive-dev mailing list archives

From Yin Huai
Subject Re: using the Hive SQL parser in Spark
Date Fri, 18 Dec 2015 21:17:41 GMT
Let me add Reynold to the thread.

On Fri, Dec 18, 2015 at 12:36 PM, Gopal Vijayaraghavan wrote:

> >We have looked into various options, and it looks like the best option is
> >to copy the ANTLR grammar file from Hive into Spark. Because the grammar
> >file is tightly coupled with Hive's semantic analysis, we need to refactor
> >some code to use them so it will end up becoming the .g file plus some
> >coupled code.
> Is the eventual goal to contribute that fork back into Hive & have Hive
> devs maintain a compatible parser for SparkSQL?
> Would that affect Hive's ability to refactor the SQL parser in the future
> or is this a one-time only deal?
> >parser. From Hive's perspective this does not provide any immediate
> >benefits. From Spark's perspective, we iterate very quickly so having to
> >depend on an external component also slows down our development. We also
> >have some requirements that simply don't apply in other projects (e.g.
> >being able to parse DataFrame expressions).
> From that I assume, this involves some form of cut-paste duplication of
> the code into SparkSQL project with that version diverging away from
> Hive's.
> > Thanks a lot for developing this parser, and we will try our best to
> > contribute back as we fix bugs. I will also make sure we have the proper
> > acknowledgment when we do this.
> Under the Apache license, there's no actual restriction against a hostile
> embrace-extend by copying Hive's code verbatim as long as the fork retains
> license notices.
> The maintainability concerns are mostly around whether this is intended as
> an ongoing relationship, including any compatibility commitments from
> hive-dev@.
> Cheers,
> Gopal
