Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 5EBD1200AF7 for ; Tue, 14 Jun 2016 10:04:41 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 5D630160A47; Tue, 14 Jun 2016 08:04:41 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 7C8F11602C5 for ; Tue, 14 Jun 2016 10:04:40 +0200 (CEST) Received: (qmail 87135 invoked by uid 500); 14 Jun 2016 08:04:36 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 87064 invoked by uid 99); 14 Jun 2016 08:04:36 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 14 Jun 2016 08:04:36 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 2B527C1EB2 for ; Tue, 14 Jun 2016 08:04:36 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.198 X-Spam-Level: * X-Spam-Status: No, score=1.198 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx2-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id TK4NqN5la0pm for ; Tue, 14 Jun 2016 08:04:33 +0000 (UTC) Received: from mail-qk0-f176.google.com (mail-qk0-f176.google.com [209.85.220.176]) by mx2-lw-eu.apache.org (ASF Mail Server at mx2-lw-eu.apache.org) with ESMTPS id E9A195F1F7 for ; Tue, 14 Jun 2016 08:04:32 +0000 (UTC) Received: by mail-qk0-f176.google.com with SMTP id p10so52227492qke.3 for ; Tue, 14 Jun 2016 01:04:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=stoKLW6nvX+hH69xthaydaK2aUeTOPJeYyflctLlA7E=; b=f0jlwhq2xUhce6ytBQERYOZtImwK39E8PXYiHm3HZ6aYtgfLQB4qsw9Cbs4V0OhXxw bY+0RrRSpxa39jBRMzvQiksXHqtDNAnIwKhF7X+okftFRbVhnZTb1o6ipyrPvZj0qcW2 R61edqndeQO9gyLd4c6XB2yECj0Yi1bbzLxccZylHmgYaH5Koxv69eFe3JAY6tKQl6Ut jneREd+xjXacj06XIr/yIQbb4s6FxsChANmJH/6WgBYaX9iH12kdRA0KV4GWsMzAEqTW uZj/aVGhyKWjIMtXpG+sPfzjz+A2eFxFqAA61xFckqTWgoW6peTXazwPiiIf+reACy+h PnCA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=stoKLW6nvX+hH69xthaydaK2aUeTOPJeYyflctLlA7E=; b=PFHHkg/KEsvzxQ3PpcAj4HpVnsUHnVfWyZ1Ukuz+CkhVKeBEPzmpa4wAV2Ks35Ozcv nceyR8Am8bdqH5UHJW/3IMBgpoDZvbyoOEGKZ8lJVuio/PRzu4FKb5LlbCcT1ZukPgEv 9mjxH3SDIBTNww0rZcN9um77750Jshiw1gHI4ZDH/tQcNf1PhlCkbcPQYv4Y3Cc0wyFk iyFNisWDNPWmuNsGXO1FWXcPye+Q2PafnI90cZ8vVJsqM0SmzbgYlOwEJ8fgIuU4vEtD NP0c2kRd0MXNgqbyZX2VOBwbgjBxxSMwOwnfWAjaGaIrZydlQyCVt2Efz4PuSpfglFm4 zs9Q== X-Gm-Message-State: ALyK8tKA3m93sprFfWuE39M2vgbEWIp4ig2aAD1YE4eAs3rATCt6o3Vny1srHMkXKseECbJO7ZxP46WhIM21FA== X-Received: by 10.55.200.152 with SMTP id t24mr19312307qkl.197.1465891471920; Tue, 14 Jun 2016 01:04:31 -0700 (PDT) MIME-Version: 1.0 Received: by 10.55.126.131 with HTTP; Tue, 14 Jun 2016 01:04:31 -0700 (PDT) In-Reply-To: References: From: Mich Talebzadeh Date: Tue, 14 Jun 2016 09:04:31 +0100 Message-ID: Subject: Re: Optimized Hive query To: user Content-Type: multipart/alternative; boundary=001a11374d8e769196053538746b archived-at: Tue, 14 Jun 2016 08:04:41 -0000 --001a11374d8e769196053538746b Content-Type: text/plain; charset=UTF-8 I presume the user is concerned with performance? The whole use case of a CBO is to take care of queries by finding the optimum access path. otherwise we would have a RBO as is in the old days of Hive. If you are in the more recent version of Hive CBO does the job. However, you may think of moving from map-reduce execution engine to something like Spark to accelerate the whole thing. Alternatively use Spark for the query on Hive (assuming that you are familiar with the product) to do the whole thing (CBO + execution). Hive is pretty mature. Hive on map-reduce is problematic. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw * http://talebzadehmich.wordpress.com On 14 June 2016 at 08:37, Aviral Agarwal wrote: > Hi, > Thanks for the replies. > I already knew that the optimizer already does that. > My usecase is a bit different though. > I want to display the flattened query back to the user. > So I was hoping of using internal Hive CBO to somehow change the AST > generated for the query somehow. > > Thanks, > Aviral > > On Tue, Jun 14, 2016 at 12:42 PM, Gopal Vijayaraghavan > wrote: > >> >> > You can see that you get identical execution plans for the nested query >> >and the flatten one. >> >> Wasn't that always though. Back when I started with Hive, before Stinger, >> it didn't have the identity project remover. >> >> To know if your version has this fix, try looking at >> >> hive> set hive.optimize.remove.identity.project; >> >> >> Cheers, >> Gopal >> >> >> >> >> > --001a11374d8e769196053538746b Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

I presume the user is concerned with p= erformance?=C2=A0

The whole use case of a CBO is t= o take care of queries by=C2=A0finding the optimum access path.
<= br>
otherwise we would have a RBO as is in the old days of Hive.<= /div>

If you are in the more recent version of Hive CBO = does the job.

However, you may think of moving fro= m map-reduce execution engine to something like Spark to accelerate the who= le thing.

Alternatively use Spark for the query on= Hive (assuming that you are familiar with the product) to do the whole thi= ng (CBO + execution).

Hive is pretty mature. Hive = on map-reduce is problematic.

HTH
=C2=A0=


On 14 June 2016 at 08:37, Aviral Agarwal <aviral12028@gmail.com> wrote:
Hi,
Thanks for the replies.
I already = knew that the optimizer already does that.
My usecase is a bit di= fferent though.=C2=A0
I want to display the flattened query back = to the user.=C2=A0
So I was hoping of using internal Hive CBO to = somehow change the AST generated for the query somehow.

Thanks,
Aviral

On Tue, J= un 14, 2016 at 12:42 PM, Gopal Vijayaraghavan <gopalv@apache.org> wrote:

> You can see that you get identical execution plans for the nested quer= y
>and the flatten one.

Wasn't that always though. Back when I started with Hive, before= Stinger,
it didn't have the identity project remover.

To know if your version has this fix, try looking at

hive> set hive.optimize.remove.identity.project;


Cheers,
Gopal






--001a11374d8e769196053538746b--