drill-issues mailing list archives

From "Sean Hsuan-Yi Chu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4577) Improve performance for query on INFORMATION_SCHEMA when HIVE is plugged in
Date Fri, 29 Apr 2016 17:02:13 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264337#comment-15264337 ]

Sean Hsuan-Yi Chu commented on DRILL-4577:

Here are a few points. Could you check whether they make sense as grounds for checking in the code?
1. [Perf-wise] Without bulk loading, the performance is simply not acceptable.
2. [Protection by adding an option] The change is guarded by an option, so by default users
would see the same behavior they had before.
3. [Comparisons with other systems] I am not sure that showing just the table names is very
harmful. For instance, when you type "show tables" in Hive, it lists all the table names,
regardless of permissions.

> Improve performance for query on INFORMATION_SCHEMA when HIVE is plugged in
> ---------------------------------------------------------------------------
>                 Key: DRILL-4577
>                 URL: https://issues.apache.org/jira/browse/DRILL-4577
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Hive
>            Reporter: Sean Hsuan-Yi Chu
>            Assignee: Sean Hsuan-Yi Chu
>             Fix For: 1.7.0
> A query such as 
> {code}
> {code}
> is converted into calls that fetch all tables from the storage plugins.
> When users have Hive plugged in, the calls to the Hive metastore would be:
> 1) get_table
> 2) get_partitions
> However, the partition information is not used in this type of query. Besides,
> a more efficient way to fetch tables is to use the get_multi_table call.
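
As a rough illustration of the point above, and not Drill's or Hive's actual code, the sketch below contrasts fetching tables one call at a time with fetching them in batches. Everything in it is hypothetical: FakeMetastore is an in-memory stand-in that only counts simulated round trips, getTable and getTables merely play the roles of the get_table and get_multi_table calls named in the description, and the batch size is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class BulkTableFetchSketch {

    /** Hypothetical in-memory stand-in for a metastore client; counts simulated round trips. */
    static class FakeMetastore {
        private final Set<String> tables = new LinkedHashSet<>();
        int roundTrips = 0;

        FakeMetastore(List<String> names) {
            tables.addAll(names);
        }

        /** One simulated round trip per table (plays the role of get_table). */
        String getTable(String name) {
            roundTrips++;
            return tables.contains(name) ? name : null;
        }

        /** One simulated round trip per batch (plays the role of the bulk call). */
        List<String> getTables(List<String> names) {
            roundTrips++;
            List<String> found = new ArrayList<>();
            for (String name : names) {
                if (tables.contains(name)) {
                    found.add(name);
                }
            }
            return found;
        }
    }

    /** Fetches each table with its own call: N tables cost N round trips. */
    static List<String> fetchOneByOne(FakeMetastore ms, List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            String table = ms.getTable(name);
            if (table != null) {
                result.add(table);
            }
        }
        return result;
    }

    /** Fetches tables in batches: N tables cost ceil(N / batchSize) round trips. */
    static List<String> fetchInBulk(FakeMetastore ms, List<String> names, int batchSize) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < names.size(); i += batchSize) {
            int end = Math.min(i + batchSize, names.size());
            result.addAll(ms.getTables(names.subList(i, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("t1", "t2", "t3", "t4", "t5");

        FakeMetastore single = new FakeMetastore(names);
        fetchOneByOne(single, names);
        System.out.println("one-by-one round trips: " + single.roundTrips);

        FakeMetastore bulk = new FakeMetastore(names);
        fetchInBulk(bulk, names, 2);
        System.out.println("bulk round trips: " + bulk.roundTrips);
    }
}
```

Both paths return the same table names; only the number of simulated calls differs. Neither path asks for partition data, which matches the observation that this type of query does not use it.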

This message was sent by Atlassian JIRA
