hive-user mailing list archives

From Stephen Boesch
Subject Select distinct on partitioned column requires reading all the files?
Date Tue, 24 Feb 2015 06:26:02 GMT
When a query touches only the partitioning column of a Hive table, one would
expect that a simple

select count(distinct partitioned_column_name) from my_partitioned_table

would complete almost instantaneously.

But we are seeing that neither Hive nor Impala executes this query
efficiently: both scan the entire table!
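
For comparison, here is a minimal sketch of a metadata-only lookup, assuming
the table is an ordinary metastore-backed partitioned table. It uses the same
table name as the query above. The statement lists the partition specs
recorded in the metastore, so the distinct partition values can be read off
its output without opening any data files. The output format differs between
Hive and Impala.

```sql
-- Lists the partitions registered in the metastore for this table.
-- This is a metadata operation, not a scan of the table's data files.
SHOW PARTITIONS my_partitioned_table;
```

For a table with a single partition column, the number of rows returned here
is the number of distinct partition values.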

What do we need to do to ensure the above query executes rapidly?
