hive-issues mailing list archives

From "Zoltan Haindrich (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-18624) Parsing time is extremely high (~10 min) for queries with complex select expressions
Date Tue, 21 Aug 2018 15:17:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-18624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16587574#comment-16587574 ]

Zoltan Haindrich commented on HIVE-18624:
-----------------------------------------

[~ashutoshc] Could you please take a look?

> Parsing time is extremely high (~10 min) for queries with complex select expressions
> ------------------------------------------------------------------------------------
>
>                 Key: HIVE-18624
>                 URL: https://issues.apache.org/jira/browse/HIVE-18624
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Parser
>    Affects Versions: 3.0.0, 2.3.2
>            Reporter: Amruth S
>            Assignee: Zoltan Haindrich
>            Priority: Major
>         Attachments: HIVE-18624.01.patch, thread_dump
>
>
> EXPLAIN of the same query takes
> 0.1 to 3 seconds in Hive 2.1.0, and
> 10-15 minutes in Hive 2.3.2 and on the latest master.
> A sample expression is below:
> {code:java}
> EXPLAIN
> SELECT DISTINCT
>   IF(lower('a') <= lower('a')
>   ,'a'
>   ,IF(('a' IS NULL AND from_unixtime(UNIX_TIMESTAMP()) <= 'a')
>   ,'a'
>   ,IF(if('a' = 'a', TRUE, FALSE) = 1
>   ,'a'
>   ,IF(('a' = 1 and lower('a') NOT IN ('a', 'a')
>        and lower(if('a' = 'a','a','a')) <= lower('a'))
>       OR ('a' like 'a' OR 'a' like 'a')
>       OR 'a' in ('a','a')
>   ,'a'
>   ,IF(if(lower('a') in ('a', 'a') and 'a'='a', TRUE, FALSE) = 1
>   ,'a'
>   ,IF('a'='a' and unix_timestamp(if('a' = 'a',cast('a' as string),coalesce('a',cast('a' as string),from_unixtime(unix_timestamp())))) <= unix_timestamp(concat_ws('a',cast(lower('a') as string),'00:00:00')) + 9*3600
>   ,'a'
>   ,If(lower('a') <= lower('a')
>       and if(lower('a') in ('a', 'a') and 'a'<>'a', TRUE, FALSE) <> 1
>   ,'a'
>   ,IF('a'=1 AND 'a'=1
>   ,'a'
>   ,IF('a' = 1 and COALESCE(cast('a' as int),0) = 0
>   ,'a'
>   ,IF('a' = 'a'
>   ,'a'
>   ,If('a' = 'a' AND lower('a')>lower(if(lower('a')<1830,'a',cast(date_add('a',1) as timestamp)))
>   ,'a'
>   ,IF('a' = 1
>   ,IF('a' in ('a', 'a') and ((unix_timestamp('a')-unix_timestamp('a')) / 60) > 30 and 'a' = 1
>   ,'a', 'a')
>   ,IF(if('a' = 'a', FALSE, TRUE ) = 1 AND 'a' IS NULL
>   ,'a'
>   ,IF('a' = 1 and 'a'>0
>   , 'a'
>   ,IF('a' = 1 AND 'a' ='a'
>   ,'a'
>   ,IF('a' is not null and 'a' is not null and 'a' > 'a'
>   ,'a'
>   ,IF('a' = 1
>   ,'a'
>   ,IF('a' = 'a'
>   ,'a'
>   ,If('a' = 1
>   ,'a'
>   ,IF('a' = 1
>   ,'a'
>   ,IF('a' = 1
>   ,'a'
>   ,IF('a' ='a' and 'a' ='a' and cast(unix_timestamp('a') as int) + 93600 < cast(unix_timestamp() as int)
>   ,'a'
>   ,IF('a' = 'a'
>   ,'a'
>   ,IF('a' = 'a' and 'a' in ('a','a','a')
>   ,'a'
>   ,IF('a' = 'a'
>   ,'a','a'))
>       )))))))))))))))))))))))
> AS test_comp_exp
> {code}
>  
> Taking a look at [^thread_dump] shows a very large function stack getting created.
> Reverting HIVE-15578 (92f31d07aa988d4a460aac56e369bfa386361776) seems to speed up the parsing.
>  
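
To check how EXPLAIN time grows with nesting depth, it may help to generate smaller variants of the sample expression above. The following is a minimal sketch, not part of the original report: the helper names `nested_if` and `explain_query` are hypothetical, and the condition is deliberately simplified to `'a' = 'a'`, so it may not trigger the slowdown on its own. The parenthesized AND/OR conditions from the sample may need to be substituted in.

```python
def nested_if(depth: int) -> str:
    """Build IF('a' = 'a', 'a', IF(...)) nested `depth` levels deep.

    Hypothetical helper: the condition is a simplified stand-in for the
    more complex conditions in the reported query.
    """
    if depth < 1:
        raise ValueError("depth must be >= 1")
    expr = "'a'"  # innermost else-branch
    for _ in range(depth):
        # Wrap the expression built so far as the else-branch of a new IF.
        expr = f"IF('a' = 'a', 'a', {expr})"
    return expr


def explain_query(depth: int) -> str:
    """Wrap the nested expression in the same EXPLAIN shape as the report."""
    return f"EXPLAIN SELECT DISTINCT {nested_if(depth)} AS test_comp_exp"


if __name__ == "__main__":
    # Print statements at a few depths; run each against the Hive
    # versions being compared and record how long EXPLAIN takes.
    for depth in (5, 10, 20):
        print(explain_query(depth))
```

Timing the generated statements at increasing depths on 2.1.0 and 2.3.2 would show whether the cost scales with the nesting depth alone or depends on the form of the conditions.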



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
