hive-user mailing list archives

From Jörn Franke <jornfra...@gmail.com>
Subject Re: Min-Max Index vs Bloom filter
Date Mon, 02 Nov 2015 19:56:04 GMT
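A bloom filter only helps with equality predicates (=), whereas the min-max index also helps
with range predicates (<, >, =). However, the latter is mainly useful for numeric values, while
the bloom filter works on nearly all types. Additionally, the bloom filter is a probabilistic
data structure: it can report false positives, but never false negatives.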
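
To make the difference concrete, here is a minimal Python sketch. It is not ORC's actual implementation: `TinyBloom`, `minmax_may_match`, the bit count, the hash scheme and the block values are all hypothetical and chosen only for illustration.

```python
import hashlib


class TinyBloom:
    """Toy bloom filter; sizes are illustrative, not ORC's defaults."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bitset stored in a Python int

    def _positions(self, value):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def minmax_may_match(lo, hi, block_min, block_max):
    """True if the block may hold a value in the closed range [lo, hi]."""
    return not (hi < block_min or lo > block_max)


block = [3, 17, 42, 99]  # values stored in one hypothetical block
bloom = TinyBloom()
for v in block:
    bloom.add(v)
block_min, block_max = min(block), max(block)

# Equality on 50: it lies inside [3, 99], so min-max cannot skip the block.
# A bloom filter can usually rule it out (subject to false positives).
minmax_keeps_block = minmax_may_match(50, 50, block_min, block_max)

# Values that were added are always reported as possibly present.
no_false_negatives = all(bloom.might_contain(v) for v in block)

# Range predicate 100..200: min-max skips the block; a bloom filter
# has no way to answer a range question.
range_block_skipped = not minmax_may_match(100, 200, block_min, block_max)
```

The sketch deliberately does not assert what `bloom.might_contain(50)` returns, since a false positive is always possible.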
For both, it makes sense to sort the data on the column that is most selective in the
WHERE clause, or at least on all columns that can be meaningfully sorted together, i.e. where
the data in those columns ends up sorted. If this is not the case, it makes sense to keep two
different tables, sorted differently depending on the query. All in all, you need a good
understanding of the data.
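
The effect of sorting on min-max pruning can be sketched roughly as follows. This is a simplified simulation, not how ORC lays out data: `blocks_to_read`, the block size of 10,000 and the uniformly random test data are assumptions made for the example.

```python
import random


def blocks_to_read(values, block_size, target):
    """Count blocks whose [min, max] range could contain target."""
    count = 0
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        if min(block) <= target <= max(block):
            count += 1
    return count


random.seed(0)  # fixed seed so the run is repeatable
data = [random.randrange(1_000_000) for _ in range(100_000)]
target = 500_000  # value used in a hypothetical "WHERE col = 500000"

# Unsorted: every block spans nearly the whole value range,
# so the min-max index cannot exclude any of them.
unsorted_reads = blocks_to_read(data, 10_000, target)

# Sorted: each block covers a narrow, non-overlapping slice of the range,
# so almost all blocks are excluded by their min and max alone.
sorted_reads = blocks_to_read(sorted(data), 10_000, target)

print(unsorted_reads, sorted_reads)
```

With unsorted data the index is of little use for this column, which is where a bloom filter on it, or a second table sorted on it, becomes the more attractive option.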

> On 02 Nov 2015, at 20:23, patcharee <Patcharee.Thongtra@uni.no> wrote:
> 
> Hi,
> 
> For the ORC format, in which scenarios is a bloom filter better than the min-max index?
> 
> Best,
> Patcharee
