hive-user mailing list archives

From Jörn Franke <>
Subject Re: Hive external indexes incorporation into Hive CBO
Date Thu, 21 Apr 2016 11:34:10 GMT
I am still not sure why you think they are not used. The main issue is that the block size
is usually very large (e.g. 256 MB, compared to kilobytes or sometimes a few megabytes in traditional
databases) and the indexes refer to blocks. This makes it less likely that you can leverage
them for small datasets in the GB range; they make more sense for TBs or PBs, unless you reduce
the block size. However, given ORC, the traditional indexes probably make less sense. I tried
to give an overview here:
However, comments to improve it are very welcome.
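
To make the block-size argument concrete, here is a rough back-of-the-envelope sketch. It is not how Hive plans queries. It assumes that an index can only skip whole blocks and that matching rows are spread uniformly and independently over the blocks. The function names and the example sizes (10 GB, 10 TB, 1000 matching rows, a 4 MB block) are made up for illustration.

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4


def block_count(dataset_bytes: int, block_bytes: int) -> int:
    """Number of blocks needed to store the dataset."""
    return math.ceil(dataset_bytes / block_bytes)


def expected_fraction_read(n_blocks: int, matching_rows: int) -> float:
    """Expected fraction of blocks holding at least one matching row.

    Assumption: matching rows land in blocks uniformly and independently,
    and a block-level index can only skip blocks with no matching row.
    """
    return 1.0 - (1.0 - 1.0 / n_blocks) ** matching_rows


if __name__ == "__main__":
    cases = [
        ("10 GB, 256 MB blocks", 10 * GB, 256 * MB),
        ("10 GB,   4 MB blocks", 10 * GB, 4 * MB),
        ("10 TB, 256 MB blocks", 10 * TB, 256 * MB),
    ]
    for label, size, block in cases:
        n = block_count(size, block)
        frac = expected_fraction_read(n, matching_rows=1000)
        print(f"{label}: {n} blocks, ~{frac:.1%} of blocks still read")
```

Under these assumptions the small dataset with large blocks ends up reading practically every block, so a block-level index saves almost nothing, while the TB-scale dataset or the reduced block size leaves many blocks that can be skipped.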

Sent from my iPhone
> On 21 Apr 2016, at 12:02, Mich Talebzadeh <> wrote:
> Hi,
> As we have discussed this a few times, Hive external indexes (as opposed to Store Indexes
> in ORC tables) are there but are not currently utilised.
> For Hive to be effective it needs to use these indexes for a variety of reasons and the
> CBO should leverage these indexes.
> I am not sure how far we are down the road with work in progress on this. However, I
> am happy to help with this.
> Dr Mich Talebzadeh
