heron-dev mailing list archives

From Yaliang Wang <yalia...@twitter.com.INVALID>
Subject Re: Proposing Changes To Heron
Date Tue, 27 Feb 2018 04:07:27 GMT
Josh,

I totally agree with your concern. I raised that idea for discussion and think of it
as a backup option. Since Heron is getting more and more popular, it would be really nice
to have SQL support. I think building SQL support into Heron can shorten development
iterations, since we would worry less about abstraction and generalization in the implementation.

Best,
Yaliang

> On Feb 26, 2018, at 7:17 PM, Josh Fischer <josh@joshfischer.io> wrote:
> 
> Yaliang,
> 
> I think this is a fantastic idea and I agree about the code maintenance
> being a cost.  My concern is that a smaller, separate project may get
> abandoned, especially if it has a smaller following.  One of the nice
> things about Heron is the large community and list of core contributors
> behind it.  But I don't want to abandon this idea.  I think, for me at
> least, it would make sense to get Storm SQL running in Heron, then take
> what we learn from that experience and apply it to a third-party project
> if there is a need/demand for it.  What do you think?
> 
> -Josh
> 
> On Mon, Feb 26, 2018 at 6:51 PM, Yaliang Wang <yaliangw@twitter.com.invalid>
> wrote:
> 
>> Sounds like a great feature to have. A question I have: would it be
>> feasible to start a separate project to support SQL on Heron-like streaming?
>> 
>> - I’m imagining that a lot of the code will be similar or identical to
>> Storm SQL.
>> - Only the last of the three steps you mentioned (parse SQL ->
>> logical/physical plan -> Heron topology) is specific to Heron. The first
>> two steps can be shared with other Heron-like streaming vendors.
>> - Native SQL support inside the Heron project would give an extra
>> advertising/marketing bonus, but at an increased code maintenance
>> cost, especially if it requires APIs that are not very popular and may
>> change over time. A separate project, however, can target a specific
>> version of Heron.
>> 
>> Best,
>> Yaliang
>> 
>>> On Feb 26, 2018, at 12:48 PM, Eren Avsarogullari <erenavsarogullari@gmail.com> wrote:
>>> 
>>> +1 for Heron SQL Support. Thanks Josh.
>>> 
>>> On 26 February 2018 at 18:42, Karthik Ramasamy <kramasamy@gmail.com> wrote:
>>> 
>>>> Thanks Josh for initiating this. It will be a great feature to add for
>>>> Heron.
>>>> 
>>>> cheers
>>>> /karthik
>>>> 
>>>>> On Feb 26, 2018, at 11:11 AM, Josh Fischer <josh@joshfischer.io> wrote:
>>>>> 
>>>>> Jerry,
>>>>> 
>>>>> Great point.  Let's keep things simple for the migration to make sure the
>>>>> implementation is correct.  Then we can modify from there.
>>>>> 
>>>>> On Sun, Feb 25, 2018 at 11:28 PM, Jerry Peng <jerry.boyang.peng@gmail.com> wrote:
>>>>> 
>>>>>> Thanks Josh for taking the initiative to get this started!  SQL on Heron
>>>>>> will be a great feature! The plan sounds great to me.  Let's first get
>>>>>> an initial version of Heron SQL out and then we can worry about
>>>>>> custom / user-defined sources and sinks.  We can even start talking
>>>>>> about UDFs (user-defined functions) at that point!
>>>>>> 
>>>>>> Best,
>>>>>> 
>>>>>> Jerry
>>>>>> 
>>>>>> On Sun, Feb 25, 2018 at 9:05 PM, Josh Fischer <josh@joshfischer.io> wrote:
>>>>>>> Please see this Google Drive link for adding comments.  I will copy and
>>>>>>> paste the drive doc below as well.
>>>>>>> 
>>>>>>> https://docs.google.com/document/d/1PxLCyR_H-mOgPjyFj3DhWXryKW21CH2zFWwzTnqjfEA/edit?usp=sharing
>>>>>>> 
>>>>>>> 
>>>>>>> Proposal Below
>>>>>>> 
>>>>>>> *I am writing this document to propose changes and to start
>>>>>>> conversations on adding functionality similar to Storm SQL to Heron.
>>>>>>> We would call it Heron SQL.  After reviewing how the code is
>>>>>>> structured in Storm, I have some suggestions and questions relating
>>>>>>> to the implementation in the Heron code base.
>>>>>>> 
>>>>>>> - High Level Overview Of Code Workflow (Keeping Similar to Storm)
>>>>>>>   - We would parse the SQL with Calcite to create the logical and
>>>>>>>     physical plans
>>>>>>>   - We would then convert the logical and physical plans to a Heron
>>>>>>>     Topology
>>>>>>>   - We would then submit the Heron Topology into the Heron System
>>>>>>> 
>>>>>>> - Some thoughts on code structure and overall functionality
>>>>>>>   - I think we should place the Heron SQL code base as a top level
>>>>>>>     directory in the repo.
>>>>>>>   - I will have to add the command “sql” to the Heron command line
>>>>>>>     code in Python.
>>>>>>>   - As a first pass implementation, users can interact with Heron SQL
>>>>>>>     via the following command
>>>>>>>     - heron sql <sql-file> <topology-name>
>>>>>>>   - We will also support the explain command for displaying the query
>>>>>>>     plan; this will not deploy the topology
>>>>>>>     - heron sql <sql-file> --explain
>>>>>>>   - After the first pass implementation is working smoothly, we can
>>>>>>>     then add an interactive command line interface to accept SQL on
>>>>>>>     the fly by omitting the SQL file argument
>>>>>>>     - heron sql <topology-name>
>>>>>>>   - We would support all of the existing functionality in Storm SQL
>>>>>>>     today with the exception of being dependent on Trident.  We would
>>>>>>>     use Storm SQL as a way to deploy topologies into Heron, similar
>>>>>>>     to how you deploy topologies with the Streamlet, Topology, and
>>>>>>>     ECO APIs
>>>>>>> 
>>>>>>> - Questions
>>>>>>>   - Do we see any issue with this plan to implement?
>>>>>>>   - I believe we would have to supply an external jar at times to
>>>>>>>     connect to external data sources, such as reuse of Kafka
>>>>>>>     libraries or database drivers.  I see that Storm has a few
>>>>>>>     external connectors for Mongo, Kafka, Redis and HDFS.  Do we want
>>>>>>>     to limit users to what we decide to build as connectors, or do we
>>>>>>>     want to give them the ability to load external jars at submit
>>>>>>>     time?  I don’t think Heron offers the ability to pass extra jars
>>>>>>>     via the “--jars” or “--artifacts” flags like Storm does today.
>>>>>>>     Would this be the correct way to pull in external jars?  Does
>>>>>>>     anyone have a different idea?  I’m thinking that this might be a
>>>>>>>     v2 feature after we get Heron SQL working well.  Ideas, thoughts
>>>>>>>     or concerns?
>>>>>>>   - Is there anything I missed?*
>>>>>> 
>>>> 
>>>> 
>> 
>> 
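
The first-pass command line described in the quoted proposal (heron sql <sql-file> <topology-name>, plus --explain to show the query plan without deploying) could be sketched roughly as below. This is only an illustration using Python's standard argparse module. It is not taken from Heron's actual CLI code, and the names build_parser and plan_action are hypothetical. The later interactive mode (omitting the SQL file) is not covered.

```python
import argparse


def build_parser():
    """Build a parser for the proposed (hypothetical) `heron sql` subcommand."""
    parser = argparse.ArgumentParser(prog="heron")
    subparsers = parser.add_subparsers(dest="command", required=True)
    sql = subparsers.add_parser("sql", help="run a SQL file as a topology")
    sql.add_argument("sql_file", metavar="sql-file")
    # The topology name is optional here only because --explain does not deploy.
    sql.add_argument("topology_name", metavar="topology-name", nargs="?")
    sql.add_argument("--explain", action="store_true",
                     help="display the query plan without deploying")
    return parser


def plan_action(argv):
    """Return (action, sql_file, topology_name) for the given arguments."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.explain:
        # Explain mode: show the plan only, nothing is submitted.
        return ("explain", args.sql_file, None)
    if args.topology_name is None:
        parser.error("topology-name is required unless --explain is given")
    return ("submit", args.sql_file, args.topology_name)
```

Calling plan_action(["sql", "query.sql", "my-topology"]) selects the submit path, while plan_action(["sql", "query.sql", "--explain"]) selects the explain path. The actual parse, plan, and topology-conversion steps are left out, since the thread does not specify them.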

