hive-issues mailing list archives

From "Teddy Choi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-20985) If select operator inputs are temporary columns vectorization may reuse some of them as output
Date Fri, 30 Nov 2018 05:44:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-20985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704292#comment-16704292 ]

Teddy Choi commented on HIVE-20985:
-----------------------------------

Looks good on correctness.

Some cases lose more memory efficiency than others. vector_reuse_scratchcols.q does so on
purpose, of course, but CASE expressions and nested math operations are common and hard
to ignore.

 

Table: the maximum column number in a map or reduce operation.
||File||Before||After||
|vector_case_when_1.q|38|68|
|vector_case_when_2.q|17|39|
|vector_decimal_aggregate.q|25|46|
|vector_decimal_expressions.q|22|38|
|vector_decimal_math_funcs.q|31|80|
|vector_interval_2.q|32|62|
|vector_reuse_scratchcols.q|44|162|
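
The growth per file can be tallied with a short script. This is only a sketch in Python: the numbers are copied from the table above, while the names max_columns and column_increase are made up for illustration and are not part of Hive.

```python
# Maximum column number per test file, copied from the table: (before, after).
max_columns = {
    "vector_case_when_1.q": (38, 68),
    "vector_case_when_2.q": (17, 39),
    "vector_decimal_aggregate.q": (25, 46),
    "vector_decimal_expressions.q": (22, 38),
    "vector_decimal_math_funcs.q": (31, 80),
    "vector_interval_2.q": (32, 62),
    "vector_reuse_scratchcols.q": (44, 162),
}


def column_increase(before, after):
    """Return (extra columns, growth ratio) for one file."""
    return after - before, after / before


for name, (before, after) in max_columns.items():
    extra, ratio = column_increase(before, after)
    print(f"{name}: +{extra} columns ({ratio:.2f}x)")
```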

Usually, a column vector has 1,024 elements of long or double, each 8 bytes. So a
column vector will take 8 KB+. vector_reuse_scratchcols.q will use 118 more column vectors,
which is about 0.9 MB+ (118 x 8 KB = 944 KB). The additional memory use will be on every
executor. Even on a 32-core machine, it will take about 29.5 MB+ more.
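
The estimate can be rechecked with a few lines of Python. This is a rough sketch under the assumptions stated in the paragraph above (1,024 elements per vector, 8 bytes per element) plus one more assumption of mine: one executor per core on the 32-core machine. It counts only the element arrays, so it is a lower bound, and the function name is hypothetical.

```python
ELEMENTS_PER_VECTOR = 1024  # elements per column vector, as stated above
BYTES_PER_ELEMENT = 8       # long or double


def extra_memory_bytes(extra_columns, executors=1):
    """Lower bound on added memory: element arrays only, no per-vector overhead."""
    return extra_columns * ELEMENTS_PER_VECTOR * BYTES_PER_ELEMENT * executors


extra_columns = 162 - 44  # vector_reuse_scratchcols.q, after minus before

per_executor = extra_memory_bytes(extra_columns)
# Assumption: one executor per core on a 32-core machine.
per_machine = extra_memory_bytes(extra_columns, executors=32)

print(f"{per_executor / 2**20:.2f} MiB per executor")
print(f"{per_machine / 2**20:.2f} MiB across 32 executors")
```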

However, I guess it may lose some of the performance advantage from CPU caches.

> If select operator inputs are temporary columns vectorization may reuse some of them as output
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-20985
>                 URL: https://issues.apache.org/jira/browse/HIVE-20985
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Major
>         Attachments: HIVE-20985.01.patch, HIVE-20985.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
