drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5972) Slow performance for query on INFORMATION_SCHEMA.TABLE
Date Wed, 29 Nov 2017 11:17:02 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16270608#comment-16270608 ]

ASF GitHub Bot commented on DRILL-5972:
---------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/1038


> Slow performance for query on INFORMATION_SCHEMA.TABLE
> ------------------------------------------------------
>
>                 Key: DRILL-5972
>                 URL: https://issues.apache.org/jira/browse/DRILL-5972
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Information Schema
>    Affects Versions: 1.11.0
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>              Labels: ready-to-commit
>             Fix For: 1.12.0
>
>
> A query like the following on INFORMATION_SCHEMA takes a long time to execute:
>
>     select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE
>     from INFORMATION_SCHEMA.`TABLES`
>     WHERE TABLE_NAME LIKE '%' AND ( TABLE_SCHEMA = 'hive.default' )
>     ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME;
>
> The reason is that we fetch table information for all schemas instead of just the 'hive.default' schema.
> If we reorder the predicate like this, the query executes very fast:
>
>     select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE
>     from INFORMATION_SCHEMA.`TABLES`
>     WHERE ( TABLE_SCHEMA = 'hive.default' ) AND TABLE_NAME LIKE '%'
>     ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME;
>
> The difference is in the order in which we evaluate the expressions in the predicate.
> In the first case, we first evaluate TABLE_NAME LIKE '%' and decide that it is inconclusive (since the table names are not yet known at the schema level). So, we go and get all tables for all the schemas.
> In the second case, we first evaluate TABLE_SCHEMA = 'hive.default' and decide that we need to fetch only the tables for that schema.
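
The order dependence described in the issue can be illustrated with a minimal Python sketch. This is not Drill's code: the names Result, schema_equals, table_name_like_any, and_order_dependent, and_order_independent and schemas_to_scan, as well as the schema list, are hypothetical and exist only for illustration. The order-independent AND is one plausible way to remove the dependence and is an assumption, not a description of the change made in the pull request.

```python
from enum import Enum


class Result(Enum):
    TRUE = 1
    FALSE = 2
    INCONCLUSIVE = 3  # cannot be decided from the fields known so far


def schema_equals(name):
    """Stand-in for TABLE_SCHEMA = '<name>'."""
    def check(record):
        if "TABLE_SCHEMA" not in record:
            return Result.INCONCLUSIVE
        return Result.TRUE if record["TABLE_SCHEMA"] == name else Result.FALSE
    return check


def table_name_like_any():
    """Stand-in for TABLE_NAME LIKE '%': undecidable until tables are fetched."""
    def check(record):
        return Result.TRUE if "TABLE_NAME" in record else Result.INCONCLUSIVE
    return check


def and_order_dependent(*children):
    """Hypothetical AND that returns the first non-TRUE child result."""
    def check(record):
        for child in children:
            result = child(record)
            if result is not Result.TRUE:
                return result  # an early INCONCLUSIVE hides a later FALSE
        return Result.TRUE
    return check


def and_order_independent(*children):
    """Assumed alternative: any FALSE child makes the whole AND FALSE."""
    def check(record):
        results = [child(record) for child in children]
        if Result.FALSE in results:
            return Result.FALSE
        if Result.INCONCLUSIVE in results:
            return Result.INCONCLUSIVE
        return Result.TRUE
    return check


def schemas_to_scan(predicate, schemas):
    """Keep every schema the predicate cannot rule out by schema name alone."""
    return [s for s in schemas
            if predicate({"TABLE_SCHEMA": s}) is not Result.FALSE]


SCHEMAS = ["hive.default", "dfs.tmp", "cp.default"]  # made-up schema names
name_first = (table_name_like_any(), schema_equals("hive.default"))
schema_first = (schema_equals("hive.default"), table_name_like_any())

slow = schemas_to_scan(and_order_dependent(*name_first), SCHEMAS)
fast = schemas_to_scan(and_order_dependent(*schema_first), SCHEMAS)
fixed = schemas_to_scan(and_order_independent(*name_first), SCHEMAS)
print(slow, fast, fixed)
```

In this model, the order-dependent AND with the TABLE_NAME condition first keeps every schema, while putting the TABLE_SCHEMA condition first, or using the order-independent AND, keeps only 'hive.default'.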



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
