hive-issues mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11882) Fetch optimizer should stop source files traversal once it exceeds the hive.fetch.task.conversion.threshold
Date Mon, 21 Sep 2015 22:12:05 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14901504#comment-14901504 ]

Gopal V commented on HIVE-11882:
--------------------------------

That looks like a missed case - assigned to you.

> Fetch optimizer should stop source files traversal once it exceeds the hive.fetch.task.conversion.threshold
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-11882
>                 URL: https://issues.apache.org/jira/browse/HIVE-11882
>             Project: Hive
>          Issue Type: Improvement
>          Components: Physical Optimizer
>    Affects Versions: 1.0.0
>            Reporter: Illya Yalovyy
>            Assignee: Illya Yalovyy
>
> Hive 1.0's fetch optimizer tries to convert queries of the form "select <C> from <T>
> where <F> limit <L>" into a fetch task (see the hive.fetch.task.conversion property).
> To decide whether to use a fetch task, the optimization gets the lengths of all the
> files in the specified partition and compares them against a threshold value (see the
> hive.fetch.task.conversion.threshold property). One of the main problems with this
> process of getting the length of all files is that the fetch optimizer doesn't seem to
> stop once the threshold is exceeded. It works fine on HDFS, but could cause a
> significant performance degradation on other supported file systems.
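
The early exit the issue asks for can be sketched roughly as below. This is an illustration only, not Hive's actual code: the class FetchThresholdSketch, the scan method and the Result type are hypothetical names, and each LongSupplier stands in for a per-file length lookup that may be slow on file systems other than HDFS. It also assumes the comparison is a running total of file lengths checked against the threshold.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical sketch, not taken from the Hive source tree.
public class FetchThresholdSketch {

    /** Outcome of the scan: whether the input fits, and how many files were looked at. */
    public static final class Result {
        public final boolean withinThreshold;
        public final int filesInspected;

        Result(boolean withinThreshold, int filesInspected) {
            this.withinThreshold = withinThreshold;
            this.filesInspected = filesInspected;
        }
    }

    /**
     * Sums file lengths one lookup at a time and returns as soon as the running
     * total exceeds the threshold, so the remaining files are never queried.
     */
    public static Result scan(List<LongSupplier> lengthLookups, long threshold) {
        long total = 0;
        int inspected = 0;
        for (LongSupplier lookup : lengthLookups) {
            total += lookup.getAsLong(); // stands in for a file system call
            inspected++;
            if (total > threshold) {
                // Already too large for a fetch task: stop traversing.
                return new Result(false, inspected);
            }
        }
        return new Result(true, inspected);
    }

    public static void main(String[] args) {
        List<LongSupplier> files =
            Arrays.<LongSupplier>asList(() -> 40L, () -> 70L, () -> 500L);
        Result r = scan(files, 100L);
        System.out.println("within threshold: " + r.withinThreshold
            + ", files inspected: " + r.filesInspected);
    }
}
```

With the three sample lengths and a threshold of 100, the running total passes the threshold at the second file, so the third lookup is skipped. On a file system where each length lookup is a remote call, that skipped work is where the saving would come from.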




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
