hive-dev mailing list archives

From "Namit Jain (JIRA)" <j...@apache.org>
Subject [jira] Created: (HIVE-1721) use bloom filters to improve the performance of map joins
Date Sat, 16 Oct 2010 00:54:35 GMT
use bloom filters to improve the performance of map joins
---------------------------------------------------------

                 Key: HIVE-1721
                 URL: https://issues.apache.org/jira/browse/HIVE-1721
             Project: Hadoop Hive
          Issue Type: New Feature
          Components: Query Processor
            Reporter: Namit Jain
            Assignee: Liyin Tang


In a map-join, it is likely that few rows of the big table have a matching row in the
small table.
Currently, we perform a hash-map lookup for every row in the big table, which can be pretty
expensive.

It might be useful to try out a bloom filter containing all the elements in the small table.
Each element from the big table is first looked up in the bloom filter, and the small-table
hash table is probed only on a positive match.
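
The idea above can be sketched roughly as follows. This is a minimal illustration in Python, not Hive's implementation: the names BloomFilter, might_contain and map_join, the (key, value) row layout, and the default filter sizes are all assumptions made for the example. It only shows the control flow (check the filter first, probe the hash table on a positive match); it says nothing about the actual speedup inside Hive.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive the bit positions from two halves of one digest (double hashing).
        digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def map_join(big_rows, small_rows, num_bits=1 << 16, num_hashes=4):
    """Inner-join (key, value) rows; returns (joined rows, hash-table probes)."""
    # Build phase: load the small table into a hash table and a Bloom filter.
    hash_table = {}
    bloom = BloomFilter(num_bits, num_hashes)
    for key, value in small_rows:
        hash_table.setdefault(key, []).append(value)
        bloom.add(key)

    # Probe phase: skip the hash table when the filter rules the key out.
    joined = []
    probes = 0
    for key, big_value in big_rows:
        if not bloom.might_contain(key):
            continue  # definitely not in the small table
        probes += 1
        # A false positive from the filter simply yields no rows here.
        for small_value in hash_table.get(key, ()):
            joined.append((key, big_value, small_value))
    return joined, probes
```

For example, calling map_join with a 1000-row big table and a small table holding two distinct keys returns the same joined rows as a plain hash join, while the probe count should stay far below 1000. The filter never drops a true match, so correctness does not depend on its false-positive rate; only the number of wasted probes does.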

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

