hive-user mailing list archives

From Yue Liu <aimagol...@gmail.com>
Subject How to use the index in Parquet to improve the query
Date Sat, 11 Jul 2015 03:42:47 GMT
Hi, All,

I am using Hive 1.2.1 and store my tables as Parquet. I have the following
query:

select count(1)
from lineitem
where l_quantity=1.0;

According to the Parquet documentation, Parquet keeps Min and Max
statistics, similar to ORC, that can be used to skip unrelated data.
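
To make clear what I expect, here is a minimal Python sketch of how I understand min/max filtering. It is not Parquet or Hive code: the RowGroup class, its fields, and count_equal are hypothetical names made up for illustration, and the sample values are invented.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RowGroup:
    """Hypothetical stand-in for a row group plus its column statistics."""
    min_quantity: float
    max_quantity: float
    values: List[float]


def count_equal(row_groups: List[RowGroup], target: float) -> Tuple[int, int]:
    """Return (matching rows, rows actually read) for a `column = target` filter."""
    matches = 0
    rows_read = 0
    for rg in row_groups:
        # If target lies outside [min, max], no row in this group can match,
        # so the whole group is skipped without reading its values.
        if target < rg.min_quantity or target > rg.max_quantity:
            continue
        rows_read += len(rg.values)
        matches += sum(1 for v in rg.values if v == target)
    return matches, rows_read


# Invented sample data: three row groups, eight rows in total.
row_groups = [
    RowGroup(1.0, 10.0, [1.0, 5.0, 10.0]),
    RowGroup(20.0, 30.0, [20.0, 25.0, 30.0]),
    RowGroup(1.0, 50.0, [1.0, 50.0]),
]
matches, rows_read = count_equal(row_groups, 1.0)
print(matches, rows_read)
```

In this sketch only the second row group is skipped, so the rows read are fewer than the table size. Such filtering can only help when the values are clustered: if 1.0 falls inside the min/max range of every row group, every row still has to be read.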

But I notice that the number of records shown by the RECORDS_IN counter is
the same as the number of rows in the whole table.

That is, the index in Parquet does not seem to take effect.

What are the reasons?

Thanks!
