hive-issues mailing list archives

From "Prasanth Jayachandran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15788) Implement FastBloomFilter to use RoaringBitmap instead of long[]
Date Mon, 24 Apr 2017 07:02:04 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15980780#comment-15980780 ]

Prasanth Jayachandran commented on HIVE-15788:
----------------------------------------------

The last time I ran a JMH benchmark, RoaringBitmap wasn't fast enough, which is the reason
I did not use it in the first place. I still doubt that this will be faster. I am sure it
will compress better, but it will come with a huge performance hit. I believe [~jdere] also
benchmarked it recently and reached the same conclusion (correct me if I am wrong). So I am
-1 on replacing the default long[] until proven otherwise.
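
For context on why the long[] layout is hard to beat on speed, here is a minimal sketch of a
long[]-backed bit set. It is an illustration only, not Hive's actual BloomFilter code; the
class name LongArrayBitSet and its methods are hypothetical. Each set or test is one array
access plus a shift and a mask, with no container lookup on the path.

```java
// Hypothetical illustration; NOT the Hive BloomFilter implementation.
final class LongArrayBitSet {
    private final long[] words;

    LongArrayBitSet(long numBits) {
        // One 64-bit word per 64 bits, rounded up.
        this.words = new long[(int) ((numBits + 63) >>> 6)];
    }

    // Set bit `index`: pick the word (index / 64), OR in the bit (index % 64).
    void set(long index) {
        // For a long left operand, Java uses only the low 6 bits of the shift count.
        words[(int) (index >>> 6)] |= 1L << index;
    }

    // Test bit `index` with the same word/bit addressing.
    boolean get(long index) {
        return (words[(int) (index >>> 6)] & (1L << index)) != 0;
    }

    public static void main(String[] args) {
        LongArrayBitSet bits = new LongArrayBitSet(1024);
        bits.set(5);
        bits.set(700);
        System.out.println(bits.get(5) + " " + bits.get(6) + " " + bits.get(700));
    }
}
```

Whether a compressed bitmap can match this would still have to be measured (for example with
JMH) on realistic bloom filter sizes and fill ratios.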

> Implement FastBloomFilter to use RoaringBitmap instead of long[] 
> -----------------------------------------------------------------
>
>                 Key: HIVE-15788
>                 URL: https://issues.apache.org/jira/browse/HIVE-15788
>             Project: Hive
>          Issue Type: Improvement
>          Components: UDF
>            Reporter: Gopal V
>            Assignee: Murali Vemulapati
>         Attachments: HIVE-15788.patch
>
>
> Currently, a bloom filter which is all 1s occupies exactly the same amount of space as a
> bloom filter which is sparse.
> This is a complete waste of space; it produces memory pressure and generates a massive
> number of cache misses.
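
The space claim in the description can be sketched in a few lines. This is an illustration
under assumptions, not code from the attached patch: the class FixedFootprintDemo and its
helper names are hypothetical. It shows that the payload of a long[] bit set depends only on
the configured number of bits, not on how many of them are set.

```java
import java.util.Arrays;

// Hypothetical illustration of the fixed footprint of a long[]-backed bit set.
final class FixedFootprintDemo {
    // Allocate enough 64-bit words to hold numBits bits.
    static long[] newBitSet(long numBits) {
        return new long[(int) ((numBits + 63) >>> 6)];
    }

    static void set(long[] words, long index) {
        words[(int) (index >>> 6)] |= 1L << index;
    }

    // Number of bits that are set.
    static long cardinality(long[] words) {
        long count = 0;
        for (long w : words) {
            count += Long.bitCount(w);
        }
        return count;
    }

    // Payload size in bytes, ignoring the JVM array header.
    static long payloadBytes(long[] words) {
        return 8L * words.length;
    }

    public static void main(String[] args) {
        long numBits = 1L << 20;

        long[] sparse = newBitSet(numBits);
        set(sparse, 42);                  // a single bit set

        long[] full = newBitSet(numBits);
        Arrays.fill(full, -1L);           // every bit set

        System.out.println("sparse: " + cardinality(sparse) + " bits set, "
                + payloadBytes(sparse) + " bytes");
        System.out.println("full:   " + cardinality(full) + " bits set, "
                + payloadBytes(full) + " bytes");
    }
}
```

With 2^20 bits, both arrays hold 16384 words, that is 131072 bytes of payload, whether one
bit is set or all of them are.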



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
