hive-user mailing list archives

From "Markovitz, Dudu" <dmarkov...@paypal.com>
Subject RE: Optimized Hive query
Date Tue, 14 Jun 2016 19:01:20 GMT
1)
Cost-based optimization in Hive
https://cwiki.apache.org/confluence/display/Hive/Cost-based+optimization+in+Hive

Calcite is an open source, Apache Licensed, query planning and execution framework. Many pieces
of Calcite are derived from the Eigenbase Project. Calcite has an optional JDBC server, a query
parser and validator, a query optimizer and pluggable data source adapters. One of the available
Calcite optimizers is a cost-based optimizer based on the Volcano paper.

2)
The Volcano Optimizer Generator: Extensibility and Efficient Search
Goetz Graefe, Portland State University
William J. McKenna, University of Colorado at Boulder
From Proc. IEEE Conf. on Data Eng., Vienna, April 1993, p. 209.

2.2. Optimizer Generator Input and Optimizer Operation
…
The user queries to be optimized by a generated optimizer are specified as an algebra
expression (tree) of logical operators. The translation from a user interface into a logical
algebra expression must be performed by the parser and is not discussed here.
…

3)
Abstract syntax tree
From Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Abstract_syntax_tree

In computer science, an abstract syntax tree (AST), or just syntax tree, is a tree
representation of the abstract syntactic structure of source code written in a programming
language.
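
As an illustration of that definition only, here is a minimal sketch using Python's standard `ast` module. It is an analogy, not Hive's parser: the expression and the names `price` and `tax` are made up for the example. It shows that the tree keeps the structure of the source (which operator applies to which operands) while dropping surface syntax such as parentheses.

```python
import ast

# Hypothetical source text; "price" and "tax" are illustrative names only.
source = "price * (1 + tax)"

# Parse the text into an abstract syntax tree (mode="eval" means a single expression).
tree = ast.parse(source, mode="eval")

# The root of the expression is the multiplication; the parenthesised part
# becomes a nested BinOp node. The parentheses themselves are not stored.
root = tree.body
print(type(root).__name__, type(root.op).__name__)
print(type(root.right).__name__, type(root.right.op).__name__)

# ast.dump gives a textual rendering of the whole tree.
print(ast.dump(tree))
```

Anything that rewrites the query works on a structure like this rather than on the original text, which is why there is no "flattened query" string to read back at any stage.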


From: Mich Talebzadeh [mailto:mich.talebzadeh@gmail.com]
Sent: Tuesday, June 14, 2016 7:58 PM
To: user <user@hive.apache.org>
Subject: Re: Optimized Hive query

Amazing. That is the first time I have heard that an optimizer does not have the concept of
a flattened query.

So what is the definition of a syntax tree? Are you referring to what the industry calls an
"access path"? This is the first time I have heard it called a syntax tree. Are you saying
that there is some explanation of the optimiser's "access path" that is produced independently
of the optimizer and is called a syntax tree?




Dr Mich Talebzadeh

LinkedIn  https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com



On 14 June 2016 at 17:46, Markovitz, Dudu <dmarkovitz@paypal.com> wrote:
It’s not the query that is being optimized but the syntax tree that is created from the
query (execute “explain extended select …”).
At no point do we have a “flattened query”.

Dudu

From: Aviral Agarwal [mailto:aviral12028@gmail.com]
Sent: Tuesday, June 14, 2016 10:37 AM
To: user@hive.apache.org
Subject: Re: Optimized Hive query

Hi,
Thanks for the replies.
I already knew that the optimizer does that.
My use case is a bit different, though.
I want to display the flattened query back to the user.
So I was hoping to use Hive's internal CBO to somehow change the AST generated for the query.

Thanks,
Aviral

On Tue, Jun 14, 2016 at 12:42 PM, Gopal Vijayaraghavan <gopalv@apache.org> wrote:

> You can see that you get identical execution plans for the nested query
> and the flattened one.

It wasn't always that way, though. Back when I started with Hive, before Stinger,
it didn't have the identity project remover.

To know if your version has this fix, try looking at

hive> set hive.optimize.remove.identity.project;
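
To make the idea concrete, here is a toy sketch in Python of what removing an identity projection means on an operator tree. It is not Hive's implementation: the `Scan`, `Filter` and `Project` classes, the table `t` and its columns are all hypothetical, and the rule is simplified to "drop a projection whose column list is exactly its input's column list".

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical logical operators; these are illustrative, not Hive classes.
@dataclass(frozen=True)
class Scan:
    table: str
    columns: Tuple[str, ...]


@dataclass(frozen=True)
class Filter:
    child: "Node"
    predicate: str


@dataclass(frozen=True)
class Project:
    child: "Node"
    columns: Tuple[str, ...]


Node = Union[Scan, Filter, Project]


def output_columns(node: Node) -> Tuple[str, ...]:
    """Columns produced by an operator; a Filter passes its input through."""
    if isinstance(node, Filter):
        return output_columns(node.child)
    return node.columns


def remove_identity_projects(node: Node) -> Node:
    """Rewrite the tree bottom-up, dropping projections that change nothing."""
    if isinstance(node, Scan):
        return node
    if isinstance(node, Filter):
        return Filter(remove_identity_projects(node.child), node.predicate)
    child = remove_identity_projects(node.child)
    if node.columns == output_columns(child):
        return child  # identity projection: same columns, same order
    return Project(child, node.columns)


t = Scan("t", ("a", "b"))
# Roughly: SELECT a, b FROM (SELECT a, b FROM t) s WHERE a > 1
nested = Project(Filter(Project(t, ("a", "b")), "a > 1"), ("a", "b"))
# Roughly: SELECT a, b FROM t WHERE a > 1
flat = Project(Filter(t, "a > 1"), ("a", "b"))

print(remove_identity_projects(nested) == remove_identity_projects(flat))
```

In this toy model both trees reduce to the same filter over the scan, which is the sense in which the nested and the flattened query end up with the same plan once the rule is applied.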


Cheers,
Gopal



