hive-user mailing list archives

From Thai Bui <blquyt...@gmail.com>
Subject Hive index + Tez engine = no performance gain?!
Date Tue, 22 Aug 2017 01:22:16 GMT
This may seem out of the blue, but my initial benchmarks show no
performance gain when a Hive index is used with the Tez engine. I'm not
sure why, but several posts online suggest that the Tez engine does not
support Hive indexes (bitmap, compact). Is that true? If so, that is sad.

I understand that the ORC format is a much better alternative if you manage
your own tables. However, at my company, several teams each pick their own
technology, and most of them use Parquet because it integrates easily with
various external systems.

Nonetheless, we still want fast ad-hoc queries via Hive LLAP / Tez. I think
an index is a perfect solution for non-ORC file formats, since you can
selectively build an index table and have Tez look at only the blocks
and/or files that need to be scanned.

Thanks for any input,
Thai
