hive-issues mailing list archives

From "Ravi Teja Chilukuri (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-16414) [Hive on Tez] Hive Union queries resource efficiency less on Tez than Mapreduce
Date Mon, 10 Apr 2017 11:34:41 GMT

     [ https://issues.apache.org/jira/browse/HIVE-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Teja Chilukuri updated HIVE-16414:
---------------------------------------
    Summary: [Hive on Tez] Hive Union queries resource efficiency less on Tez than Mapreduce
 (was: [Hive on Tez] Union queries resources efficiency less on Tez than Mapreduce)

> [Hive on Tez] Hive Union queries resource efficiency less on Tez than Mapreduce
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-16414
>                 URL: https://issues.apache.org/jira/browse/HIVE-16414
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: 2.1.0
>            Reporter: Ravi Teja Chilukuri
>
> When a Hive union query whose subqueries read the same table is run on MapReduce and on Tez,
> MapReduce reads the table only once, no matter how many subqueries read it,
> but Tez reads the same table multiple times, in the form of multiple vertices.
> If a table is to be read by X mappers,
> Tez runs kX map tasks, where k is the number of subqueries reading from the same table, and
> MapReduce runs X mappers no matter how many subqueries are present.
> For such union queries, we need to fall back to MR instead of Tez.
> *Query:*
> http://pastebin.com/t6n91u6a
> *Tez explain plan:*
> http://pastebin.com/aWwVxhii
> *MR explain plan:*
> http://pastebin.com/iDbWwtKR
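
The task counts described in the issue can be sketched as a small calculation. This is only an illustration of the reporter's X versus kX model, not Hive or Tez code: the function name `map_tasks`, the engine labels, and the sample values are hypothetical and do not come from the linked query or explain plans.

```python
def map_tasks(engine: str, mappers_per_scan: int, subqueries: int) -> int:
    """Estimate map tasks for a union whose subqueries all read one table.

    Model from the issue description (an assumption, not measured data):
    MapReduce shares a single scan of the table (X tasks), while Tez
    creates one scan vertex per subquery (k * X tasks).
    """
    if mappers_per_scan < 0 or subqueries < 1:
        raise ValueError("need mappers_per_scan >= 0 and subqueries >= 1")
    if engine == "mr":
        return mappers_per_scan
    if engine == "tez":
        return subqueries * mappers_per_scan
    raise ValueError(f"unknown engine: {engine!r}")


# Illustrative values only: X = 100 mappers per scan, k = 3 subqueries.
x, k = 100, 3
print("mr :", map_tasks("mr", x, k))    # X tasks
print("tez:", map_tasks("tez", x, k))   # k * X tasks
```

Under this model the extra work on Tez grows linearly with the number of subqueries, which is why the report proposes falling back to MR for such queries.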



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
