hive-issues mailing list archives

From "Prasanth Jayachandran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-13255) FloatTreeReader.nextVector is expensive
Date Wed, 30 Mar 2016 21:30:25 GMT

    [ https://issues.apache.org/jira/browse/HIVE-13255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218897#comment-15218897 ]

Prasanth Jayachandran commented on HIVE-13255:
----------------------------------------------

It definitely needs the < 0 check. It might not need the logic that loops until the
requested size is satisfied, but since we are passing an InputStream instead of an InStream
(which does not need the looping logic), I added that to the latest patch as well. Branch
prediction plus inlining should mitigate the branch cost.
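
To illustrate the two pieces discussed above (the < 0 check and the loop that runs until
the requested size is satisfied), here is a minimal sketch. It is not the code from the
attached patches: the class ReadFullySketch and its methods readFully and readFloat are
hypothetical names, and the little-endian byte order is an assumption of this example.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch, not the code from the HIVE-13255 patches.
final class ReadFullySketch {

    // Keeps calling read() until len bytes have arrived. A generic InputStream
    // may return fewer bytes than requested, so one read() call is not enough.
    static void readFully(InputStream in, byte[] buf, int off, int len)
            throws IOException {
        int done = 0;
        while (done < len) {
            int n = in.read(buf, off + done, len - done);
            if (n < 0) {
                // The "< 0 check": read() returns -1 at end of stream.
                throw new EOFException(
                    "Stream ended after " + done + " of " + len + " bytes");
            }
            done += n;
        }
    }

    // Decodes one 4-byte float. Little-endian order is assumed here.
    static float readFloat(InputStream in) throws IOException {
        byte[] b = new byte[4];
        readFully(in, b, 0, 4);
        int bits = (b[0] & 0xff)
                 | ((b[1] & 0xff) << 8)
                 | ((b[2] & 0xff) << 16)
                 | ((b[3] & 0xff) << 24);
        return Float.intBitsToFloat(bits);
    }
}
```

The branch inside the loop is taken only at end of stream, so it is highly predictable,
and both methods are small enough to be candidates for JIT inlining.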

> FloatTreeReader.nextVector is expensive 
> ----------------------------------------
>
>                 Key: HIVE-13255
>                 URL: https://issues.apache.org/jira/browse/HIVE-13255
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-13255.1.patch, HIVE-13255.2.patch, bytecode-size-after.png,
>                      bytecode-size-before.png, float-reader-perf.png, q1-bottleneck.png,
>                      q1-warm-perf-map.png
>
>
> Some TPC-DS queries at 1 TB scale show FloatTreeReader in profile samples. This is most
> likely because of multiple branches and polymorphic dispatch in the
> FloatTreeReader.nextVector() implementation. See the attached image for the sampling
> profile output.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
