heron-dev mailing list archives

From Yaliang Wang <yalia...@twitter.com.INVALID>
Subject Re: Proposing Changes To Heron
Date Tue, 27 Feb 2018 00:51:26 GMT
Sounds like a great feature to have. A question I have: would it be feasible to start a
separate project to support SQL on Heron-like streaming?

- I'm imagining that a lot of the code will be similar or identical to Storm SQL.
- Only the last of the three steps you mentioned (parse SQL -> logical/physical plan -> Heron
topology) is specific to Heron. The first two steps could be shared with other
Heron-like streaming vendors.
- Native support for SQL inside the Heron project would give an extra advertising/marketing
bonus, but at an increased code maintenance cost, especially if it requires APIs that are
not very popular and may change over time. A separate project, however, could target a
specific version of Heron.
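
To make the split concrete, here is a toy Python sketch of the three steps. Everything in it (`parse_sql`, `to_heron_topology`, the plan dictionaries) is a hypothetical illustration, not real Heron or Calcite API; a real implementation would use Apache Calcite for parsing and planning, as Storm SQL does. The point is that only the last function needs to know anything about Heron:

```python
import re

# Toy illustration of the proposed three-step pipeline.
# All names here are hypothetical; a real implementation would use
# Apache Calcite for parsing and planning, as in Storm SQL.

def parse_sql(sql):
    """Steps 1-2 (shareable): parse a trivial SELECT into a logical plan."""
    m = re.match(r"SELECT\s+(.+?)\s+FROM\s+(\w+)", sql, re.IGNORECASE)
    if not m:
        raise ValueError("unsupported query: %s" % sql)
    fields = [f.strip() for f in m.group(1).split(",")]
    return {"op": "project", "fields": fields,
            "input": {"op": "scan", "source": m.group(2)}}

def to_heron_topology(plan, name):
    """Step 3 (Heron-specific): map plan nodes onto spout/bolt components."""
    scan = plan["input"]
    return {"topology": name,
            "spout": scan["source"],                 # scan -> source spout
            "bolts": ["project:%s" % ",".join(plan["fields"])]}

plan = parse_sql("SELECT id, word FROM sentences")
topology = to_heron_topology(plan, "word-count-sql")
print(topology)
```

In this shape, `parse_sql` (and any optimizer pass over the plan it returns) is the part a shared project could own, while each vendor supplies its own equivalent of `to_heron_topology`.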


> On Feb 26, 2018, at 12:48 PM, Eren Avsarogullari <erenavsarogullari@gmail.com> wrote:
> +1 for Heron SQL Support. Thanks Josh.
> On 26 February 2018 at 18:42, Karthik Ramasamy <kramasamy@gmail.com> wrote:
>> Thanks Josh for initiating this. It will be a great feature to add for
>> Heron.
>> cheers
>> /karthik
>>> On Feb 26, 2018, at 11:11 AM, Josh Fischer <josh@joshfischer.io> wrote:
>>> Jerry,
>>> Great point.  Let's keep things simple for the migration to make sure the
>>> implementation is correct.  Then we can modify from there.
>>> On Sun, Feb 25, 2018 at 11:28 PM, Jerry Peng <
>> jerry.boyang.peng@gmail.com>
>>> wrote:
>>>> Thanks Josh for taking the initiative to get this started!  SQL on Heron
>>>> will be a great feature! The plan sounds great to me.  Let's first get
>>>> an initial version of the Heron SQL out and then we can worry about
>>>> custom / user defined sources and sinks.  We can even start talking
>>>> about UDFs (User defined functions) at that point!
>>>> Best,
>>>> Jerry
>>>> On Sun, Feb 25, 2018 at 9:05 PM, Josh Fischer <josh@joshfischer.io>
>> wrote:
>>>>> Please see this Google Drive link for adding comments.  I will copy and
>>>>> paste the Drive doc below as well.
>>>>> https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing
>>>>>
>>>>> Proposal Below
>>>>>
>>>>> I am writing this document to propose changes and to start conversations
>>>>> on adding functionality similar to Storm SQL to Heron.  We would call it
>>>>> Heron SQL.  After reviewing how the code is structured in Storm, I have
>>>>> some suggestions and questions relating to the implementation in the
>>>>> Heron code base.
>>>>>
>>>>> High Level Overview Of Code Workflow (Keeping Similar to Storm)
>>>>> - We would parse the SQL with Calcite to create the logical and physical
>>>>>   plans
>>>>> - We would then convert the logical and physical plans to a Heron Topology
>>>>> - We would then submit the Heron Topology into the Heron System
>>>>>
>>>>> Some thoughts on code structure and overall functionality
>>>>> - I think we should place the Heron SQL code base as a top-level
>>>>>   directory in the repo.
>>>>> - I will have to add the command "sql" to the Heron command code in
>>>>>   Python.
>>>>> - As a first-pass implementation, users can interact with Heron SQL via
>>>>>   the following command: heron sql <sql-file> <topology-name>
>>>>> - We will also support the explain command for displaying the query
>>>>>   plan; this will not deploy the topology: heron sql <sql-file> --explain
>>>>> - After the first-pass implementation is working smoothly, we can then
>>>>>   add an interactive command line interface that accepts SQL on the fly
>>>>>   by omitting the SQL file argument: heron sql <topology-name>
>>>>> - We would support all of the existing functionality in Storm SQL today,
>>>>>   with the exception of being dependent on Trident.  We would use Storm
>>>>>   SQL as a way to deploy topologies into Heron, similar to how you
>>>>>   deploy topologies with the Streamlet, Topology, and ECO APIs.
>>>>>
>>>>> Questions
>>>>> - Do we see any issue with this plan to implement?
>>>>> - I believe we would have to supply an external jar at times to connect
>>>>>   to external data sources, such as reuse of Kafka libraries or database
>>>>>   drivers.  I see that Storm has a few external connectors for Mongo,
>>>>>   Kafka, Redis, and HDFS.  Do we want to limit users to what we decide
>>>>>   to build as connectors, or do we want to give them the ability to load
>>>>>   external jars at submit time?  I don't think Heron offers the ability
>>>>>   to pass extra jars via the "--jars" or "--artifacts" flags like Storm
>>>>>   does.  Would this be the correct way to pull in external jars?  Does
>>>>>   anyone have a different idea?  I'm thinking that this might be a v2
>>>>>   feature after we get Heron SQL working well.  Ideas, thoughts, or
>>>>>   concerns?
>>>>> - Is there anything I missed?
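
As a rough illustration of the command surface the proposal describes, here is a hypothetical Python sketch of how the "sql" subcommand could be wired into the Heron CLI with argparse. Only the shapes `heron sql <sql-file> <topology-name>` and `heron sql <sql-file> --explain` come from the proposal; the parser wiring below is illustrative, not actual Heron CLI code:

```python
import argparse

# Hypothetical sketch of the proposed "heron sql" subcommand.
# The argument shapes come from the proposal; the wiring is illustrative.

def build_parser():
    parser = argparse.ArgumentParser(prog="heron")
    sub = parser.add_subparsers(dest="command")
    sql = sub.add_parser("sql", help="run a SQL file as a Heron topology")
    sql.add_argument("sql_file", help="file containing the SQL statements")
    # topology name is optional so "--explain" can run without deploying
    sql.add_argument("topology_name", nargs="?",
                     help="name for the submitted topology")
    sql.add_argument("--explain", action="store_true",
                     help="print the query plan without deploying")
    return parser

args = build_parser().parse_args(["sql", "query.sql", "--explain"])
print(args.command, args.sql_file, args.explain)
```

Making `topology_name` optional keeps both invocations valid with one parser; the later interactive mode (omitting the SQL file) would need a second pass over the argument shapes.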
