calcite-dev mailing list archives

From Adeel Qureshi <adeelmahm...@gmail.com>
Subject Integrating calcite into a sql processing pipeline
Date Sat, 04 Jul 2015 23:19:02 GMT
I have gone through the documentation on the Calcite website. The gist, as
I understand it, is that I can pass a SQL statement to Calcite and tell it
how to read my data through implementations of SchemaFactory, Schema, Table
and Enumerator, and it will apply the query to my dataset and return the
results. That makes sense and works as long as you want to hand off SQL
processing to Calcite completely. But there are cases where you want to
control the processing pipeline and potentially process the SQL differently,
while still letting Calcite do the heavy lifting. Here are two examples:

1. My limited understanding of Apache Drill's query execution is that it
uses Calcite to produce the logical plan for a SQL query. I am not sure
whether it also builds the physical plan with Calcite or whether that is
internal to Drill. Either way, the physical plan is then distributed as
fragments of SQL for different Drill nodes to process. Without going too
deep into how Drill works, I am mostly interested in how Drill steps in
before Calcite applies the SQL directly to the data and returns results.
It's almost as if they use Calcite only to produce the plan and then take
it from there. Are those APIs exposed, and can someone point me to the
relevant code in the Calcite project? Basically, I want to be able to
control the complete process: taking the SQL, producing a logical or
physical plan, and then applying it to some dataset.

2. Another example is Hive, which uses Calcite for its cost-based logical
optimizer and integrates that into its own SQL processing flow instead of
passing the SQL to Calcite and letting it do all the processing. This seems
to be the pattern in which these other projects use Calcite (in a somewhat
indirect way, not handing off SQL processing completely), but I have not
been able to find any information on the Calcite site about incorporating
Calcite into other projects.

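A minimal sketch of the split described in the two examples above: one
stage turns SQL into a plan, and the host system then runs that plan
itself. Every name here (StagedPipeline, LogicalPlan, plan, execute) is
hypothetical, and the regex is only a toy stand-in for a real planner; none
of this is Calcite API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StagedPipeline {

    /** Hypothetical logical plan: scan one table, keep values above a bound. */
    static final class LogicalPlan {
        final String table;
        final int lowerBound;

        LogicalPlan(String table, int lowerBound) {
            this.table = table;
            this.lowerBound = lowerBound;
        }
    }

    // Toy stand-in for the planning stage. It accepts only queries of the
    // form "SELECT * FROM <table> WHERE v > <int>".
    private static final Pattern QUERY =
        Pattern.compile("(?i)SELECT \\* FROM (\\w+) WHERE v > (\\d+)");

    /** Stage 1 (delegated): turn SQL text into a plan, touching no data. */
    static LogicalPlan plan(String sql) {
        Matcher m = QUERY.matcher(sql.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported query: " + sql);
        }
        return new LogicalPlan(m.group(1), Integer.parseInt(m.group(2)));
    }

    /** Stage 2 (owned by the host system): run the plan against the rows. */
    static List<Integer> execute(LogicalPlan plan, List<Integer> rows) {
        List<Integer> out = new ArrayList<>();
        for (int v : rows) {
            if (v > plan.lowerBound) {
                out.add(v);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        LogicalPlan p = plan("SELECT * FROM t WHERE v > 10");
        // The caller decides how and where the plan is executed.
        System.out.println(p.table + " " + execute(p, Arrays.asList(5, 11, 42)));
    }
}
```

The point of the sketch is only the boundary: plan() never sees the data,
and execute() never sees the SQL, so the execution stage could be replaced
or distributed without touching the planning stage.
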
I would appreciate some insight into the matter. Thanks.
