Date: Mon, 10 Apr 2017 11:34:41 +0000 (UTC)
From: "Ravi Teja Chilukuri (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Updated] (HIVE-16414) [Hive on Tez] Hive Union queries resource efficiency less on Tez than Mapreduce

     [ https://issues.apache.org/jira/browse/HIVE-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Teja Chilukuri updated HIVE-16414:
---------------------------------------
    Summary: [Hive on Tez] Hive Union queries resource efficiency less on Tez than Mapreduce  (was: [Hive on Tez] Union queries resources efficiency less on Tez than Mapreduce)

> [Hive on Tez] Hive Union queries resource efficiency less on Tez than Mapreduce
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-16414
>                 URL: https://issues.apache.org/jira/browse/HIVE-16414
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: 2.1.0
>            Reporter: Ravi Teja Chilukuri
>
> When a Hive union query whose subqueries read the same table is run on MapReduce and on Tez, MapReduce reads the table only once, no matter how many reads of that table the query contains,
> but Tez reads the same table multiple times, as multiple vertices.
> If a table is to be read by X mappers,
> Tez runs kX map tasks, where k is the number of subqueries reading from that table, while
> MapReduce runs X mappers no matter how many subqueries are present.
> For such union queries, we need to fall back to MR instead of Tez.
> *Query:*
> http://pastebin.com/t6n91u6a
> *Tez explain plan:*
> http://pastebin.com/aWwVxhii
> *MR explain plan:*
> http://pastebin.com/iDbWwtKR

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)