drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-3921) Hive LIMIT 1 queries take too long
Date Mon, 26 Oct 2015 15:41:27 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974406#comment-14974406 ]

ASF GitHub Bot commented on DRILL-3921:
---------------------------------------

Github user sudheeshkatkam commented on the pull request:

    https://github.com/apache/drill/pull/197#issuecomment-151178107
  
    The same query, on a cluster running both Hive and Drill, takes ~24 seconds to plan. The majority
of this time is spent converting the Calcite logical plan to the Drill logical plan, which
includes creating Hive scan configurations. I have not done a detailed analysis of where
the time is spent (~1 millisecond for each reader), but that is a separate issue.


> Hive LIMIT 1 queries take too long
> ----------------------------------
>
>                 Key: DRILL-3921
>                 URL: https://issues.apache.org/jira/browse/DRILL-3921
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>            Reporter: Sudheesh Katkam
>            Assignee: Sudheesh Katkam
>
> Fragment initialization on a Hive table (that is backed by a directory of many files)
can take a very long time. This is evident in LIMIT 1 queries. The root cause is that the
underlying reader in the HiveRecordReader is initialized when the constructor is called, rather
than when setup is called.
> Two changes need to be made:
> 1) lazily initialize the underlying record reader in HiveRecordReader
> 2) allow for running a callable as a proxy user within an operator (through OperatorContext).
This is required as initialization of the underlying record reader needs to be done as a proxy
user (proxy for owner of the file). Previously, this was handled while creating the record
batch tree.
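
The two changes described above can be illustrated with a minimal, self-contained sketch. It is not the code from the pull request: LazyReader, ProxyRunner, and LazyReaderDemo are hypothetical names, the "underlying reader" is reduced to a String handle, and the proxy-user step is a plain pass-through rather than Drill's OperatorContext or any real impersonation mechanism. The only point is to show that constructing many readers costs nothing until setup() is called on the ones actually used.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for "run a callable as a proxy user"; not a Drill API. */
interface ProxyRunner {
    <T> T runAs(String user, Callable<T> work) throws Exception;
}

/** Hypothetical reader that defers opening its underlying reader until setup(). */
class LazyReader {
    private final String owner;
    private final ProxyRunner runner;
    private final Callable<String> opener;
    private String underlying; // stays null until setup() runs

    LazyReader(String owner, ProxyRunner runner, Callable<String> opener) {
        // The constructor only records what is needed later; nothing is opened here.
        this.owner = owner;
        this.runner = runner;
        this.opener = opener;
    }

    boolean isInitialized() {
        return underlying != null;
    }

    /** Opens the underlying reader once, as the file owner; later calls are no-ops. */
    void setup() {
        if (underlying != null) {
            return;
        }
        try {
            underlying = runner.runAs(owner, opener);
        } catch (Exception e) {
            throw new IllegalStateException("failed to open underlying reader", e);
        }
    }
}

class LazyReaderDemo {
    /** Counts how many times the (simulated) expensive open actually ran. */
    static final AtomicInteger OPEN_COUNT = new AtomicInteger();

    /** Pass-through runner: a real one would switch user identity before calling work. */
    static final ProxyRunner PASS_THROUGH = new ProxyRunner() {
        @Override
        public <T> T runAs(String user, Callable<T> work) throws Exception {
            return work.call();
        }
    };

    /** Builds one reader per (simulated) file without opening any of them. */
    static LazyReader[] createReaders(int n) {
        LazyReader[] readers = new LazyReader[n];
        for (int i = 0; i < n; i++) {
            final int id = i;
            readers[i] = new LazyReader("file-owner", PASS_THROUGH, () -> {
                OPEN_COUNT.incrementAndGet(); // simulated expensive open
                return "handle-" + id;
            });
        }
        return readers;
    }

    public static void main(String[] args) {
        LazyReader[] readers = createReaders(1000);
        System.out.println("opened after construction: " + OPEN_COUNT.get());
        readers[0].setup(); // a LIMIT 1 style query may only need the first reader
        System.out.println("opened after first setup: " + OPEN_COUNT.get());
    }
}
```

In this sketch the expensive open moves out of the constructor and into setup(), so readers that are never set up are never opened. Routing the open through a runner object is one way to keep the "as the file owner" requirement attached to the deferred call rather than to the code that builds the readers.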



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
