hive-user mailing list archives

From Gopal Vijayaraghavan
Subject Re: When using ORC, how to confirm Stripes/row-groups are being skipped?
Date Thu, 04 Jun 2015 18:33:49 GMT

> While using ORC file format, I would like to see in the logs that
>stripes and/or row-groups are being skipped based on my where clause.

There's no logging in the inner loop there.

> Is that info even outputted ? If so, what do I need to enable it ?

You can do a query run with the following to see the difference.

hive> set hive.tez.print.exec.summary=true;
hive> set hive.optimize.index.filter=false;
// run query 
hive> set hive.optimize.index.filter=true;
// run query

You'll get numbers which will indicate how much row-filtering is
happening, since the input records count for the vertex will track the
actual records read off ORC.

For an example of what that does, see

If you have hive-1.2.0 builds, then you can also try setting the
TBLPROPERTIES for orc.bloom.filter.columns to use the new row indexes as

For Strings, that should work much better than the current min-max index.
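
As a rough sketch of what that could look like (the table and column names
here, events_orc, events_raw and user_name, are made up for illustration and
are not from the original question), the property can be set when the table
is created. Bloom filters are only written when ORC files are written, so
data loaded before the property was set would need to be rewritten to pick
them up.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE events_orc (
  event_id  BIGINT,
  user_name STRING
)
STORED AS ORC
TBLPROPERTIES ('orc.bloom.filter.columns'='user_name');

-- Bloom filters are built at write time, so load the data after the
-- property is in place. events_raw is an assumed source table.
INSERT INTO TABLE events_orc
SELECT event_id, user_name FROM events_raw;

-- The row indexes are only consulted when the filter is pushed down.
SET hive.optimize.index.filter=true;
SELECT count(*) FROM events_orc WHERE user_name = 'alice';
```

Comparing the input records count for this query with
hive.optimize.index.filter set to false and then true, as above, should show
whether the bloom filter is helping for an equality predicate like this one.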

