flink-dev mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: SQL on Flink
Date Wed, 27 May 2015 14:41:45 GMT
+1 for committer passion!

Please don't get me wrong, I think SQL on Flink would be a great feature.
I just wanted to make the point that, if SQL is implemented on top of the
Table API, the Table API needs to mirror all SQL features.


2015-05-27 16:34 GMT+02:00 Kostas Tzoumas <ktzoumas@apache.org>:

> I think Fabian's arguments make a lot of sense.
>
> However, if Timo *really wants* to start SQL on top of Table, that is what
> he will do a great job at :-) As usual, we can keep it in beta status in
> flink-staging until it is mature... and it will help create issues for the
> Table API and give direction to its development. Perhaps we will have a
> feature-poor SQL for a bit, then switch to hardening the Table API to
> support more features and then back to SQL.
>
> I'm just advocating for "committer passion"-first here :-) Perhaps Timo
> should weigh in.
>
> On Wed, May 27, 2015 at 4:19 PM, Fabian Hueske <fhueske@gmail.com> wrote:
>
> > IMO, it is better to have one feature that is reasonably well developed
> > instead of two half-baked features. That's why I proposed to advance the
> > Table API a bit further before starting the next big thing. I played
> > around with the Table API recently and I think it definitely needs a bit
> > more contributor attention and more features to be actually usable. Also
> > since all features of the SQL interface need to be included in the Table
> > API (given we follow the SQL on Table approach) it makes sense IMO to
> > push the Table API a bit further before going for the next thing.
> >
> > 2015-05-27 16:06 GMT+02:00 Stephan Ewen <sewen@apache.org>:
> >
> > > I see no reason why a SQL interface cannot be "bootstrapped"
> > > concurrently. It would initially not support many operations, but
> > > would act as a good source to test and drive functionality from the
> > > Table API.
> > >
> > >
> > > @Ted:
> > >
> > > I would like to learn a bit more about the stack and internal
> > > abstractions of Drill. It may make sense to reuse some of the query
> > > execution operators from Drill. I especially like the "learning schema
> > > on the fly" part of Drill.
> > >
> > > Flink DataSets and Streams have a schema, but it may in several cases
> > > be a "schema lower bound", like the greatest common superclass. Those
> > > cases may benefit big time from Drill's ability to refine schema on
> > > the fly.
> > >
> > > That may be useful also in the Table API, making it again available to
> > > LINQ-like programs and SQL scripts.
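
The "greatest common superclass" idea can be shown with plain Java
reflection, independent of Flink: given elements of mixed concrete types,
walk up the class hierarchy until an ancestor covers them all. This is only
a sketch; it ignores interfaces, which a real type extractor would also
have to consider.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Sketch of the "schema lower bound" idea: when the elements of a data set
 * have mixed concrete types, the only statically known schema is their
 * greatest common superclass. Plain reflection, no Flink involved.
 */
public class CommonSuperclass {

    /** Walks up the class hierarchy until an ancestor of all elements is found. */
    static Class<?> greatestCommonSuperclass(List<?> elements) {
        Class<?> candidate = elements.get(0).getClass();
        while (candidate != Object.class) {
            final Class<?> c = candidate;
            if (elements.stream().allMatch(c::isInstance)) {
                return candidate;
            }
            candidate = candidate.getSuperclass();
        }
        return Object.class;
    }

    public static void main(String[] args) {
        // Integer and Double share Number as their most specific superclass.
        System.out.println(greatestCommonSuperclass(Arrays.asList(1, 2.0)));
        // class java.lang.Number
    }
}
```

A schema refined on the fly, Drill-style, would tighten this bound as more
concrete element types are observed.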
> > >
> > > On Wed, May 27, 2015 at 3:49 PM, Robert Metzger <rmetzger@apache.org>
> > > wrote:
> > >
> > > > I didn't know that paper... Thanks for sharing.
> > > >
> > > > I've worked on a SQL layer for Stratosphere some time ago, using
> > > > Apache Calcite (called Optiq back then). I think the project provides
> > > > a lot of very good tooling for creating a SQL layer. So if we decide
> > > > to go for SQL on Flink, I would suggest using Calcite. I can also
> > > > help you a bit with Calcite to get started with it.
> > > >
> > > > I agree with Fabian that it would probably make more sense for now to
> > > > enhance the Table API. I think the biggest limitation right now is
> > > > that it only supports POJOs. We should also support Tuples (I know
> > > > that's difficult to do), data from HCatalog (that includes Parquet &
> > > > ORC), JSON, ...
> > > > Then, I would add filter and projection pushdown to the Table API.
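
Filter pushdown, as suggested above, can be sketched in a few lines: rather
than reading every row and filtering afterwards, the predicate is handed to
the source so that non-matching rows are never materialized. The `Source`
interface below is invented for the sketch and is not a Flink API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/**
 * Minimal illustration of filter pushdown: the predicate from a query's
 * WHERE clause is pushed into the scan, so the source drops non-matching
 * rows instead of emitting them for a downstream filter operator.
 */
public class FilterPushdown {

    interface Source {
        List<Integer> read(IntPredicate pushedFilter);
    }

    /** A source that applies the pushed-down predicate while reading. */
    static Source listSource(int... values) {
        return filter -> {
            List<Integer> out = new ArrayList<>();
            for (int v : values) {
                if (filter.test(v)) {  // rows are dropped at the source
                    out.add(v);
                }
            }
            return out;
        };
    }

    public static void main(String[] args) {
        Source src = listSource(1, 5, 10, 20);
        // The "optimizer" pushes the WHERE v > 5 predicate into the scan.
        System.out.println(src.read(v -> v > 5));  // [10, 20]
    }
}
```

The same shape applies to projection pushdown: the source is told which
columns are needed and never reads the rest.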
> > > >
> > > >
> > > >
> > > > On Tue, May 26, 2015 at 10:03 PM, Ted Dunning <ted.dunning@gmail.com>
> > > > wrote:
> > > >
> > > > > It would also be relatively simple (I think) to retarget Drill to
> > > > > Flink if Flink doesn't provide enough typing metadata to do
> > > > > traditional SQL.
> > > > >
> > > > >
> > > > >
> > > > > On Tue, May 26, 2015 at 12:52 PM, Fabian Hueske <fhueske@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Flink's Table API is pretty close to what SQL provides. IMO, the
> > > > > > best approach would be to leverage that and build a SQL parser
> > > > > > (maybe together with a logical optimizer) on top of the Table
> > > > > > API. The parser (and optimizer) could be built using Apache
> > > > > > Calcite, which provides exactly this.
> > > > > >
> > > > > > Since the Table API is still a fairly new component and not very
> > > > > > feature-rich, it might make sense to extend and strengthen it
> > > > > > before putting something major on top.
> > > > > >
> > > > > > Cheers, Fabian
> > > > > >
> > > > > > 2015-05-26 21:38 GMT+02:00 Timo Walther <twalthr@apache.org>:
> > > > > >
> > > > > > > Hey everyone,
> > > > > > >
> > > > > > > I would be interested in having a complete SQL API in Flink.
> > > > > > > What is the status there? Is someone already working on it? If
> > > > > > > not, I would like to work on it. I found
> > > > > > > http://ijcsi.org/papers/IJCSI-12-1-1-169-174.pdf but I couldn't
> > > > > > > find anything on the mailing list or Jira. Otherwise I would
> > > > > > > open an issue and start a discussion about it there.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Timo
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
