hive-issues mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-16151) BytesBytesHashTable allocates large arrays
Date Wed, 05 Apr 2017 19:25:41 GMT

    [ https://issues.apache.org/jira/browse/HIVE-16151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15957494#comment-15957494 ]

Gopal V commented on HIVE-16151:
--------------------------------

This showed up as a ~4% performance hit from an extra null check, but it does allow for larger hash tables.

LGTM - +1.

> BytesBytesHashTable allocates large arrays
> ------------------------------------------
>
>                 Key: HIVE-16151
>                 URL: https://issues.apache.org/jira/browse/HIVE-16151
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Prasanth Jayachandran
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-16151.patch
>
>
> These arrays cause GC pressure and also impose key count limitations on the table. With respect to
> the latter, we won't be able to get rid of it without a 64-bit hash function, but for now
> we can get rid of the former. If we need the latter, we'd add murmur64 and probably account
> for it differently for resize (we don't want to blow up the hashtable by 4 bytes/key in the
> common case where the number of keys is less than ~1.5B :))
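
For illustration only, here is a minimal Java sketch of one common way to avoid a single very large array: split it into fixed-size chunks that are allocated on first write. This is not taken from HIVE-16151.patch; the class name ChunkedLongArray, its methods, and the chunk size parameter are all hypothetical. The null check in get() is the kind of extra per-lookup branch that could plausibly account for a small slowdown like the one mentioned in the comment above.

```java
/**
 * Hypothetical sketch, not the Hive implementation: a logical long array
 * backed by fixed-size chunks that are allocated lazily on first write.
 */
final class ChunkedLongArray {
    private final long capacity;
    private final int chunkShift;   // log2 of the chunk length
    private final int chunkMask;    // chunk length - 1
    private final long[][] chunks;  // entries stay null until first written

    ChunkedLongArray(long capacity, int chunkShift) {
        if (capacity <= 0 || chunkShift < 1 || chunkShift > 30) {
            throw new IllegalArgumentException("bad capacity or chunkShift");
        }
        long chunkCount = (capacity + (1L << chunkShift) - 1) >>> chunkShift;
        if (chunkCount > Integer.MAX_VALUE - 8) {
            throw new IllegalArgumentException("too many chunks for this chunk size");
        }
        this.capacity = capacity;
        this.chunkShift = chunkShift;
        this.chunkMask = (1 << chunkShift) - 1;
        // Only the small outer array is allocated up front.
        this.chunks = new long[(int) chunkCount][];
    }

    long get(long index) {
        checkIndex(index);
        long[] chunk = chunks[(int) (index >>> chunkShift)];
        // Extra null check: an unallocated chunk reads as all zeros.
        return chunk == null ? 0L : chunk[(int) (index & chunkMask)];
    }

    void set(long index, long value) {
        checkIndex(index);
        int c = (int) (index >>> chunkShift);
        long[] chunk = chunks[c];
        if (chunk == null) {
            // Allocate one moderately sized chunk instead of one huge array.
            chunk = new long[1 << chunkShift];
            chunks[c] = chunk;
        }
        chunk[(int) (index & chunkMask)] = value;
    }

    /** Number of chunks that have actually been allocated so far. */
    int allocatedChunks() {
        int n = 0;
        for (long[] chunk : chunks) {
            if (chunk != null) {
                n++;
            }
        }
        return n;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= capacity) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }
}
```

Because the index is a long, a structure like this is not bound by the int index limit of a single Java array, and the garbage collector sees many moderate allocations rather than one very large one. Any limit imposed by a 32-bit hash function is a separate matter and is not addressed by this sketch.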



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
