lucene-dev mailing list archives

From "Joel Bernstein (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
Date Mon, 05 Dec 2016 16:15:58 GMT

    [ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15722648#comment-15722648
] 

Joel Bernstein edited comment on SOLR-8593 at 12/5/16 4:15 PM:
---------------------------------------------------------------

I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm working through each
assertion in each method to understand the differences between the current release and the
work done in this patch, and making changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The current patch
doesn't descend through a full nested AND/OR predicate. So I made a few changes to how the
tree is walked. I also changed somewhat how the query is rewritten to a Lucene/Solr query so
that it matches the current implementation.
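
To illustrate the kind of traversal described above, here is a minimal sketch of a recursive walk that descends through a fully nested AND/OR predicate and emits a Lucene-style query string. It is not the code from the patch: the Node, Eq and Bool classes and the toQuery method are hypothetical stand-ins invented for this example, and they do not use Calcite's actual expression classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PredicateWalk {
    // Hypothetical predicate node: a leaf comparison or an AND/OR over children.
    static abstract class Node {}

    // Leaf: field = value.
    static final class Eq extends Node {
        final String field;
        final String value;
        Eq(String field, String value) { this.field = field; this.value = value; }
    }

    // Boolean node: op is "AND" or "OR"; children may themselves be Bool nodes.
    static final class Bool extends Node {
        final String op;
        final List<Node> children;
        Bool(String op, Node... children) {
            this.op = op;
            this.children = Arrays.asList(children);
        }
    }

    // Recursively descends the whole tree, so nesting of any depth is preserved
    // through parentheses in the resulting query string.
    static String toQuery(Node n) {
        if (n instanceof Eq) {
            Eq eq = (Eq) n;
            return eq.field + ":" + eq.value;
        }
        Bool b = (Bool) n;
        List<String> parts = new ArrayList<>();
        for (Node child : b.children) {
            parts.add(toQuery(child));
        }
        return "(" + String.join(" " + b.op + " ", parts) + ")";
    }

    public static void main(String[] args) {
        // WHERE a = 1 AND (b = 2 OR (c = 3 AND d = 4))
        Node where = new Bool("AND",
            new Eq("a", "1"),
            new Bool("OR",
                new Eq("b", "2"),
                new Bool("AND", new Eq("c", "3"), new Eq("d", "4"))));
        System.out.println(toQuery(where)); // prints (a:1 AND (b:2 OR (c:3 AND d:4)))
    }
}
```

A walk that only handled the top-level operands would lose the inner OR/AND grouping; recursing on every child keeps it.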

I've now moved on to aggregate queries. I've been investigating the use of EXPR$1 ... instead
of using the *function signature* in the result set. It looks like we'll have to use the Calcite
expression identifiers going forward, which should be OK. I think this is cleaner anyway because
looking up fields by a function signature can get cumbersome. We'll just need to document
this in the CHANGES.txt.

The next step for me is to implement the aggregationMode=facet logic for aggregate queries. After
that I'll push out my changes to this branch. 

Then I'll spend some time investigating how SELECT DISTINCT behaves in our implementation.
As [~julianhyde] mentioned, we should see DISTINCT queries as aggregate queries, so it's possible
we'll have all the code in place to push this to Solr already.
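
The reason a DISTINCT query can be seen as an aggregate query is that SELECT DISTINCT over a set of columns returns the same rows as a GROUP BY over those columns with no aggregate functions. The sketch below shows that equivalence on in-memory rows. The class, the groupBy and row helpers, and the column names str_s and field_i are all assumptions made for illustration; none of this is taken from the SQLHandler or from Calcite.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DistinctAsAggregate {
    // Groups rows by the given columns and emits one row per group, in first-seen
    // order. With no aggregate functions, this is exactly SELECT DISTINCT columns.
    static List<List<String>> groupBy(List<Map<String, String>> rows, List<String> columns) {
        Set<List<String>> groups = new LinkedHashSet<>();
        for (Map<String, String> row : rows) {
            List<String> key = new ArrayList<>();
            for (String column : columns) {
                key.add(row.get(column));
            }
            groups.add(key); // a repeated key falls into an existing group
        }
        return new ArrayList<>(groups);
    }

    // Hypothetical helper that builds a two-column row.
    static Map<String, String> row(String strS, String fieldI) {
        Map<String, String> r = new HashMap<>();
        r.put("str_s", strS);
        r.put("field_i", fieldI);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = Arrays.asList(
            row("a", "1"), row("a", "1"), row("b", "2"), row("a", "2"));
        // Equivalent to: SELECT DISTINCT str_s, field_i FROM ...
        System.out.println(groupBy(rows, Arrays.asList("str_s", "field_i")));
        // prints [[a, 1], [b, 2], [a, 2]]
    }
}
```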




was (Author: joel.bernstein):
I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm working through each
assertion in each method to understand the differences between the current release the work
done in this patch, and making changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The current pant
doesn't descend through a full nested AND/OR predicate. So I made a few changes to how the
tree is walked. I also changed some how the query is re-written to a Lucene/Solr query so
that it matches the current implementation.

I've now moved on to aggregate queries. I've been investigating the use of EXPR$1 ... instead
of using the *function signature* in the result set. It looks like we'll have to use the Caclite
expression identifiers going forward, which should be OK. I think this is cleaner anyway because
looking up fields by a function signature can get cumbersome. We'll just need to document
this in the CHANGES.txt.

The next step for me is implement the aggregationMode=facet logic for aggregate queries. After
that I'll push out my changes to this branch. 

Then I'll spend some time investigation how SELECT distinct behaves in our implementation.
As [~julianhyde] mentioned. we should see DISTINCT queries as aggregate queries so it's possible
we'll have all the code in place to push this to Solr already.



> Integrate Apache Calcite into the SQLHandler
> --------------------------------------------
>
>                 Key: SOLR-8593
>                 URL: https://issues.apache.org/jira/browse/SOLR-8593
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>         Attachments: SOLR-8593.patch, SOLR-8593.patch
>
>
>    The Presto SQL Parser was perfect for phase one of the SQLHandler. It was nicely split
> off from the larger Presto project and it did everything that was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where Apache Calcite
> comes into play. It has a battle tested cost based optimizer and has been integrated into
> Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans will continue
> to be translated to Streaming API objects (TupleStreams), so continued work on the JDBC driver
> should plug in nicely with the Calcite work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

