hive-dev mailing list archives

From "Teddy Choi (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-20873) Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision
Date Tue, 06 Nov 2018 18:00:00 GMT
Teddy Choi created HIVE-20873:
---------------------------------

             Summary: Use Murmur hash for VectorHashKeyWrapperTwoLong to reduce hash collision
                 Key: HIVE-20873
                 URL: https://issues.apache.org/jira/browse/HIVE-20873
             Project: Hive
          Issue Type: Improvement
            Reporter: Teddy Choi
            Assignee: Teddy Choi


VectorHashKeyWrapperTwoLong computes its hash code with only a few bit-shift and XOR
operations. This keeps computation time short but produces more hash collisions, so group-by
operations become very slow on large data sets. It needs Murmur hash or a better hash
function to reduce hash collisions.
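
To illustrate the trade-off described above, here is a minimal Java sketch. It is not the Hive implementation: the class TwoLongHashSketch, the method shiftXorHash (a stand-in for a cheap shift/XOR hash) and the way murmurStyleHash combines the two longs are all assumptions made for this example. Only fmix64 is taken from a known source, the 64-bit finalizer of MurmurHash3. The sketch counts how many distinct 32-bit hash codes each function yields over a grid of small key pairs.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; names and the weak hash are hypothetical, not Hive code.
final class TwoLongHashSketch {

    // Hypothetical cheap hash: one shift and two XORs.
    // For small inputs the result stays in a narrow range, so many keys collide.
    static int shiftXorHash(long a, long b) {
        long x = a ^ (b << 1);
        return (int) (x ^ (x >>> 32));
    }

    // 64-bit finalizer (fmix64) from MurmurHash3: mixes every input bit
    // into every output bit.
    static long fmix64(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    // Assumed way of combining two longs: mix the first, fold in the second,
    // mix again, then fold 64 bits down to 32.
    static int murmurStyleHash(long a, long b) {
        long h = fmix64(fmix64(a) ^ b);
        return (int) (h ^ (h >>> 32));
    }

    // Counts distinct hash codes over all key pairs (a, b) with 0 <= a, b < n.
    static int countDistinct(boolean useMurmur, int n) {
        Set<Integer> seen = new HashSet<>();
        for (long a = 0; a < n; a++) {
            for (long b = 0; b < n; b++) {
                seen.add(useMurmur ? murmurStyleHash(a, b) : shiftXorHash(a, b));
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int n = 256; // 65,536 distinct key pairs
        System.out.println("shift/XOR distinct hashes:    " + countDistinct(false, n));
        System.out.println("murmur-style distinct hashes: " + countDistinct(true, n));
    }
}
```

With keys below 256, shiftXorHash can never produce more than 512 different values, because a ^ (b << 1) always fits in nine bits. The murmur-style variant spreads the same keys across the full 32-bit range at the cost of two multiplications per mixing step.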



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
