quickstep-dev mailing list archives

From Jignesh Patel <jipa...@pivotal.io>
Subject Re: [jira] [Commented] (QUICKSTEP-20) Add parser support for SQL window aggregation function
Date Wed, 15 Jun 2016 15:15:57 GMT
Great points Julian, especially about algebra. Couldn’t agree more.

In fact, we have been strong advocates of the viewpoint that it is all about the algebraic
framework. Furthermore, we have argued that the relational algebraic framework is the right
“core” on which to build a platform. With it you can go well beyond warehousing/SQL and
(with small extensions) also build:

#1: JSON document stores (see Argo <http://pages.cs.wisc.edu/~chasseur/pubs/argo-short.pdf>),

#2: Iterative graph analytics (see Grail <http://www.cs.wisc.edu/~jignesh/publ/Grail.pdf>),

#3: Relational learning (see QuickFOIL <http://www.cs.wisc.edu/~jignesh/publ/QuickFoil.pdf>),

#4: Biological data management (see Periscope/SQ <http://www.vldb.org/conf/2007/papers/demo/p1406-tata.pdf>
and Periscope/GQ <http://www.vldb.org/pvldb/1/1454184.pdf>).

If all of that is not enough, there are nice synergies between deeper integration of common
classes of machine learning and relational data representation. A key idea here is factorized
learning, which my student Arun Kumar (co-advised with Naughton) introduced last year <http://pages.cs.wisc.edu/~arun/orion/LearningOverJoinsSIGMOD.pdf>.
Arun will present a far deeper follow-on paper <http://pages.cs.wisc.edu/~arun/hamlet/OptFSSIGMOD.pdf>
on this topic at SIGMOD in a few weeks. Interestingly, many other papers are starting to build
on these initial ideas. There is still a bunch of theory to figure out, but as a research
community we are collectively getting very close to nailing that.

In my keynote @ SIGMOD last year <http://dl.acm.org/citation.cfm?doid=2723372.2723374>,
I talked about how theory (see papers above) has shown that with an extended relational algebraic
core these seemingly different applications converge to a platform that is powered by a relational
core. This converged platform is the long-term vision for Quickstep. Yup — I hear you, I
need to write this up for the community. You are right and I’m adding it to my list :-)

We have shown prototypes for all of the above, but haven’t put it all together. That is
the hard part, and we are at the start of that journey. That effort is also revealing all
kinds of interesting systems research issues — so good for the students on the project.
Potentially exciting times ahead!


> On Jun 14, 2016, at 2:32 PM, Julian Hyde <jhyde@apache.org> wrote:
> Having that representation reduces coupling in your architecture, so is useful even if
> you don’t decide to use a library for SQL parsing/planning. But I think once you have it
> you will realize that all of the interesting problems for the project happen after the query
> has been converted to algebra.
> Julian
