hive-issues mailing list archives

From "Prasanth Jayachandran (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-15065) SimpleFetchOptimizer should decide based on metastore stats when available
Date Wed, 26 Oct 2016 02:16:58 GMT

     [ https://issues.apache.org/jira/browse/HIVE-15065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-15065:
-----------------------------------------
    Status: Patch Available  (was: Open)

> SimpleFetchOptimizer should decide based on metastore stats when available
> --------------------------------------------------------------------------
>
>                 Key: HIVE-15065
>                 URL: https://issues.apache.org/jira/browse/HIVE-15065
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 2.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-15065.1.patch
>
>
> Currently, the decision whether to use the fetch optimizer is made by scanning the filesystem
for file lengths and checking whether the aggregated size is less than the fetch task threshold. This can
be very expensive in cloud environments. HIVE-14920 mitigates this issue to some extent,
but it still requires a filesystem scan. We can instead make the decision based on the stats from the metastore,
falling back to the filesystem scan when stats are not available. Since fast stats (numRows and fileSize) are always
available, this should work most of the time.
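
The proposal above can be sketched roughly as follows. This is not the code in HIVE-15065.1.patch, only a minimal illustration of "prefer metastore stats, fall back to a filesystem scan". The class name, the method names, and the use of a plain parameter map are hypothetical. The stat key "totalSize" is an assumption about how the metastore stores the size stat that the description calls fileSize, and the strict less-than comparison simply follows the wording of the description.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.function.LongSupplier;

// Hypothetical sketch; none of these names come from the Hive code base.
final class FetchDecisionSketch {

    // Assumed key under which the metastore keeps the table's total size in bytes.
    static final String SIZE_STAT_KEY = "totalSize";

    // Reads a non-negative long stat from the table parameters, if one is present and parseable.
    static OptionalLong readStat(Map<String, String> tableParams, String key) {
        String raw = tableParams.get(key);
        if (raw == null) {
            return OptionalLong.empty();
        }
        try {
            long value = Long.parseLong(raw.trim());
            return value >= 0 ? OptionalLong.of(value) : OptionalLong.empty();
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    // Decides whether the data is small enough for a fetch task.
    // fileSystemScan stands in for the expensive file-length aggregation and
    // is only invoked when the metastore stat is missing or unusable.
    static boolean shouldUseFetchTask(Map<String, String> tableParams,
                                      long thresholdBytes,
                                      LongSupplier fileSystemScan) {
        OptionalLong statSize = readStat(tableParams, SIZE_STAT_KEY);
        long dataSize = statSize.isPresent()
                ? statSize.getAsLong()
                : fileSystemScan.getAsLong();
        return dataSize < thresholdBytes;
    }
}
```

Passing the scan as a LongSupplier keeps it lazy, so no filesystem call is made at all when the metastore already has a usable size stat.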



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
