hive-dev mailing list archives

From "Ted Xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-5771) Constant propagation optimizer for Hive
Date Sun, 29 Jun 2014 09:41:24 GMT

    [ https://issues.apache.org/jira/browse/HIVE-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047086#comment-14047086 ]

Ted Xu commented on HIVE-5771:
------------------------------

Hi [~ashutoshc], thanks for looking into that issue. I don't have enough context on HIVE-7232,
but the issue is still there after HIVE-7232 is patched.

I looked into subquery_views.q; there seems to be an extra filter predicate that breaks the
query. Notice the following hive.log segment:

{code}
2014-06-29 01:20:00,662 INFO  ppd.OpProcFactory (OpProcFactory.java:process(209)) - Processing for FIL(37)
2014-06-29 01:20:00,663 INFO  ppd.OpProcFactory (OpProcFactory.java:logExpr(601)) - Pushdown Predicates of FIL For Alias : sq_1_notin_nullcheck
2014-06-29 01:20:00,663 INFO  ppd.OpProcFactory (OpProcFactory.java:logExpr(604)) -   (_col0 = 0)
2014-06-29 01:20:00,663 INFO  ppd.OpProcFactory (OpProcFactory.java:process(549)) - Processing for SEL(36)
2014-06-29 01:20:00,663 INFO  ppd.OpProcFactory (OpProcFactory.java:logExpr(601)) - Pushdown Predicates of SEL For Alias : sq_1_notin_nullcheck
2014-06-29 01:20:00,663 INFO  ppd.OpProcFactory (OpProcFactory.java:logExpr(604)) -   (_col0 = 0)
2014-06-29 01:20:00,663 INFO  ppd.OpProcFactory (OpProcFactory.java:process(549)) - Processing for GBY(35)
2014-06-29 01:20:00,664 INFO  ppd.OpProcFactory (OpProcFactory.java:process(549)) - Processing for RS(34)
2014-06-29 01:20:00,666 INFO  ppd.OpProcFactory (OpProcFactory.java:process(549)) - Processing for GBY(33)
2014-06-29 01:20:00,667 INFO  ppd.OpProcFactory (OpProcFactory.java:process(549)) - Processing for SEL(32)
2014-06-29 01:20:00,667 INFO  ppd.OpProcFactory (OpProcFactory.java:process(209)) - Processing for FIL(31)
2014-06-29 01:20:00,667 INFO  ppd.OpProcFactory (OpProcFactory.java:logExpr(601)) - Pushdown Predicates of FIL For Alias : sq_1
2014-06-29 01:20:00,667 INFO  ppd.OpProcFactory (OpProcFactory.java:logExpr(604)) -   ((_col0 is null or _col1 is null) or _col2 is null)
{code}

FIL(37) has a constant predicate (_col0 = 0), which is presumably propagated to the following
operators, and that breaks FIL(31). The query doesn't contain such a predicate; I'm not sure
whether it is introduced by the exists/in clause.
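
To make the suspected failure mode concrete, here is a toy sketch (not Hive's code; the expression classes and the `propagate` helper are hypothetical names invented for this illustration). It assumes the optimizer carries the fact `_col0 = 0` over to FIL(31), where `_col0` may well refer to a different column. Under that assumption the `_col0 is null` branch folds to false and silently drops out of the predicate:

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Const:
    value: object  # None models SQL NULL


@dataclass(frozen=True)
class IsNull:
    arg: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


Expr = Union[Col, Const, IsNull, Or]


def _is_bool_const(expr: Expr, value: bool) -> bool:
    # Identity check so that Const(0) is never mistaken for Const(False).
    return isinstance(expr, Const) and expr.value is value


def propagate(expr: Expr, known: Dict[str, object]) -> Expr:
    """Substitute columns with known constants, then fold constant sub-expressions."""
    if isinstance(expr, Col):
        return Const(known[expr.name]) if expr.name in known else expr
    if isinstance(expr, IsNull):
        arg = propagate(expr.arg, known)
        if isinstance(arg, Const):
            return Const(arg.value is None)  # "<constant> is null" is decidable
        return IsNull(arg)
    if isinstance(expr, Or):
        left = propagate(expr.left, known)
        right = propagate(expr.right, known)
        if _is_bool_const(left, True) or _is_bool_const(right, True):
            return Const(True)
        if _is_bool_const(left, False):
            return right
        if _is_bool_const(right, False):
            return left
        return Or(left, right)
    return expr  # Const: nothing to do


# Predicate logged for FIL(31): ((_col0 is null or _col1 is null) or _col2 is null)
fil31 = Or(Or(IsNull(Col("_col0")), IsNull(Col("_col1"))), IsNull(Col("_col2")))

# Assumption for illustration: the constant from FIL(37) is applied here as well.
folded = propagate(fil31, {"_col0": 0})
```

With the constant applied, `folded` is `(_col1 is null) or (_col2 is null)`; without it, the predicate is left unchanged. If `_col0` in FIL(31) is not the same column as `_col0` in FIL(37), that rewrite changes the query result.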

> Constant propagation optimizer for Hive
> ---------------------------------------
>
>                 Key: HIVE-5771
>                 URL: https://issues.apache.org/jira/browse/HIVE-5771
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Ted Xu
>            Assignee: Ted Xu
>         Attachments: HIVE-5771.1.patch, HIVE-5771.10.patch, HIVE-5771.11.patch, HIVE-5771.12.patch, HIVE-5771.2.patch, HIVE-5771.3.patch, HIVE-5771.4.patch, HIVE-5771.5.patch, HIVE-5771.6.patch, HIVE-5771.7.patch, HIVE-5771.8.patch, HIVE-5771.9.patch, HIVE-5771.patch, HIVE-5771.patch.javaonly
>
>
> Currently there is no constant folding/propagation optimizer; all expressions are evaluated at runtime.
> HIVE-2470 did a great job of evaluating constants in the UDF initialization phase; however, it is still a runtime evaluation, and it doesn't propagate constants from a subquery to the outer query.
> Introducing such an optimizer may reduce I/O and speed up processing.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
