drill-issues mailing list archives

From "Jacques Nadeau (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-3180) Apache Drill JDBC storage plugin to query rdbms systems such as MySQL and Netezza from Apache Drill
Date Wed, 16 Sep 2015 16:49:46 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14790714#comment-14790714 ]

Jacques Nadeau commented on DRILL-3180:
---------------------------------------

I think you may be relying on behavior that is specific to particular databases. The opinions
of [~julianhyde], [~jni] and [~amansinha100] would probably be helpful here. If my query has
an *INNER* join with an additional single-table, join-local filter condition, then all of
these are logically equivalent:

- filter condition applied as part of join evaluation
- filter applied after join evaluation
- filter applied before join evaluation

As such, in Drill we should be able to rewrite to any of those and things should be ok. Additionally,
a derived table expressed in the same query does not force or guarantee the ordering of
operations either. The optimizer's purpose is to find all equivalent sets and then pick what it
thinks is the best one. If you could force the optimizer to order operations implicitly, that
would mean less experienced SQL writers would compose bad SQL and the optimizer couldn't do
anything about it.
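
A minimal sketch of the three placements listed above, assuming hypothetical `orders` and
`customers` tables and using SQLite through Python's `sqlite3` module purely as a stand-in
engine (this is not Drill or Calcite output, only an illustration of the equivalence):

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount INTEGER);
    CREATE TABLE customers (id INTEGER, region TEXT);
    INSERT INTO orders VALUES (1, 10, 50), (2, 11, 75), (3, 12, 20);
    INSERT INTO customers VALUES (10, 'EU'), (11, 'US'), (12, 'EU');
""")

# Filter applied as part of join evaluation.
in_join = """
    SELECT o.id, c.region
    FROM orders o JOIN customers c
      ON o.customer_id = c.id AND c.region = 'EU'
    ORDER BY o.id
"""

# Filter applied after join evaluation.
after_join = """
    SELECT o.id, c.region
    FROM orders o JOIN customers c ON o.customer_id = c.id
    WHERE c.region = 'EU'
    ORDER BY o.id
"""

# Filter applied before join evaluation, via a derived table.
before_join = """
    SELECT o.id, c.region
    FROM orders o
    JOIN (SELECT * FROM customers WHERE region = 'EU') c
      ON o.customer_id = c.id
    ORDER BY o.id
"""

results = [conn.execute(q).fetchall() for q in (in_join, after_join, before_join)]
print(results[0] == results[1] == results[2])
```

All three statements return the same rows here, which is why an optimizer is free to pick
any of the forms for an inner join.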

That being said, the discussion of logical equivalences is separate from what you really
want: pushing down the join even if there is an additional filter condition within
the join. That seems reasonable and could be done in the Calcite project, specifically right
here: https://github.com/apache/incubator-calcite/blob/master/core/src/main/java/org/apache/calcite/adapter/jdbc/JdbcRules.java#L201

Please note, that doesn't mean that you should expect a particular SQL construction to be
passed to the underlying JDBC system. This is because the expression goes through the Calcite
optimizer. So while you may compose the query with a filter as part of the join condition,
Drill may output the query to the JDBC source using a different but equivalent pattern. This
should be expected, as Drill should produce a logically equivalent dataset.

This all changes if the join with the condition is an OUTER join.
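
To illustrate why, here is a small sketch with hypothetical `orders` and `customers` tables,
again using SQLite via Python's `sqlite3` module only as a stand-in engine. With a LEFT OUTER
join, moving a filter on the null-supplying side from the join condition to the WHERE clause
changes the result:

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, customer_id INTEGER);
    CREATE TABLE customers (id INTEGER, region TEXT);
    INSERT INTO orders VALUES (1, 10), (2, 11), (3, 12);
    INSERT INTO customers VALUES (10, 'EU'), (11, 'US'), (12, 'EU');
""")

# Filter inside the join condition: every order is kept, and orders whose
# customer fails the filter are padded with NULL.
filter_in_on = conn.execute("""
    SELECT o.id, c.region
    FROM orders o LEFT JOIN customers c
      ON o.customer_id = c.id AND c.region = 'EU'
    ORDER BY o.id
""").fetchall()

# Filter after the join: the NULL-padded row fails the predicate and is dropped.
filter_in_where = conn.execute("""
    SELECT o.id, c.region
    FROM orders o LEFT JOIN customers c ON o.customer_id = c.id
    WHERE c.region = 'EU'
    ORDER BY o.id
""").fetchall()

print(filter_in_on)     # three rows, one with a NULL region
print(filter_in_where)  # two rows
```

Because the two forms differ, the filter cannot be moved freely across an outer join the way
it can across an inner join.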

> Apache Drill JDBC storage plugin to query rdbms systems such as MySQL and Netezza from Apache Drill
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-3180
>                 URL: https://issues.apache.org/jira/browse/DRILL-3180
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Storage - Other
>    Affects Versions: 1.0.0
>            Reporter: Magnus Pierre
>            Assignee: Jacques Nadeau
>              Labels: Drill, JDBC, plugin
>             Fix For: 1.2.0
>
>         Attachments: patch.diff, pom.xml, storage-mpjdbc.zip
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> I have developed the base code for a JDBC storage-plugin for Apache Drill. The code is primitive but constitutes a good starting point for further coding. Today it provides primitive support for SELECT against RDBMS with JDBC.
> The goal is to provide complete SELECT support against RDBMS with push down capabilities.
> Currently the code is using standard JDBC classes.



