hive-dev mailing list archives

From "Ashutosh Chauhan (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-5771) Constant propagation optimizer for Hive
Date Sun, 19 Jan 2014 19:32:21 GMT

    [ https://issues.apache.org/jira/browse/HIVE-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13875991#comment-13875991 ]

Ashutosh Chauhan edited comment on HIVE-5771 at 1/19/14 7:32 PM:
-----------------------------------------------------------------

Pretty good work, Ted. Hive has needed this optimization for a long time. Thanks for taking
it up.
I scanned the patch, mostly looking at the .q.out changes. Most of them look correct, except
the following:

* smb_mapjoin_18.q: Seems like a map-only job has turned into an MR job.
* smb_mapjoin_25.q: An extra MR stage got introduced.
* groupby_sort_1.q: An extra MR stage got introduced.
* groupby_sort_skew_1.q: An extra MR stage got introduced.
* udf_between.q: between 2 and '3' got optimized away. The types don't match here, so shouldn't
this have been optimized into an always-false filter instead?
* decimal.q: The optimization is turned off. Any particular reason?
* pcr.q: The optimization is turned off. Any particular reason?

I haven't looked at the code changes yet. I will be looking at those soon.


> Constant propagation optimizer for Hive
> ---------------------------------------
>
>                 Key: HIVE-5771
>                 URL: https://issues.apache.org/jira/browse/HIVE-5771
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Ted Xu
>            Assignee: Ted Xu
>         Attachments: HIVE-5771.1.patch, HIVE-5771.2.patch, HIVE-5771.3.patch, HIVE-5771.4.patch,
>                      HIVE-5771.5.patch, HIVE-5771.6.patch, HIVE-5771.patch
>
>
> Currently there is no constant folding/propagation optimizer; all expressions are evaluated
> at runtime.
> HIVE-2470 did a great job of evaluating constants during the UDF initialization phase; however,
> it is still a runtime evaluation, and it doesn't propagate constants from a subquery to the
> outer query.
> Introducing such an optimizer may reduce I/O and speed up processing.
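
For readers unfamiliar with the technique named in the issue description, here is a minimal
sketch of constant folding and propagation over a toy expression tree. It is not taken from
the HIVE-5771 patch and does not use any Hive classes; the names Const, Col, BinOp, OPS and
fold are hypothetical and exist only for this illustration.

```python
import operator
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    """A literal value known at compile time."""
    value: object


@dataclass(frozen=True)
class Col:
    """A reference to a column whose value is normally known only at runtime."""
    name: str


@dataclass(frozen=True)
class BinOp:
    """A binary expression such as `left + right` or `left = right`."""
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Const, Col, BinOp]

# Hypothetical operator table for this sketch; a real optimizer covers far more.
OPS = {"+": operator.add, "*": operator.mul, "=": operator.eq, "<": operator.lt}


def fold(expr, known=None):
    """Return an equivalent expression with constant subtrees pre-evaluated.

    `known` maps column names to constants propagated from elsewhere, for
    example from a subquery that selects a literal under that column name.
    """
    known = known or {}
    if isinstance(expr, Col):
        # Propagation: replace a column by its known constant, if any.
        return Const(known[expr.name]) if expr.name in known else expr
    if isinstance(expr, BinOp):
        left = fold(expr.left, known)
        right = fold(expr.right, known)
        if isinstance(left, Const) and isinstance(right, Const):
            # Folding: both operands are constants, so evaluate once here.
            return Const(OPS[expr.op](left.value, right.value))
        return BinOp(expr.op, left, right)
    return expr
```

Under these assumptions, a predicate like k = 2 * 3 is rewritten to k = 6 before execution,
and a predicate whose operands all become constants collapses to a single constant filter.
The sketch deliberately ignores type coercion, which is the concern raised about
udf_between.q in the comment above.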



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
