hive-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Hive Partitioning
Date Wed, 15 Dec 2010 22:01:35 GMT
On Wed, Dec 15, 2010 at 4:52 PM, Mark <static.void.dev@gmail.com> wrote:
> Can someone explain what partitioning is and why it would be used.. example?
> Thanks
>

A partition is both a physical and a logical division of a table's
data. The query planner can use conditions on the partition columns in
the WHERE clause to prune data that Hive does not need to process.

For example, if you partition your table by day, you can write queries
such as SELECT count(1) FROM table WHERE day=20100101. Hive will only
use that single partition as input, rather than the entire table.
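
As a rough sketch of what that looks like end to end (the table name
page_views and its columns are made up for illustration, they are not
from any real schema), something along these lines should work:

```sql
-- Hypothetical table; page_views, user_id and url are assumed names.
-- The partition column (day) is declared separately from the data columns.
CREATE TABLE page_views (
  user_id BIGINT,
  url     STRING
)
PARTITIONED BY (day INT);

-- Register one partition; its data is stored apart from other days.
ALTER TABLE page_views ADD PARTITION (day = 20100101);

-- The filter on the partition column lets Hive read only that partition.
SELECT count(1) FROM page_views WHERE day = 20100101;
```

A query without a condition on day would have to scan every partition
of the table.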

Generally, you do not want too many small partitions, or too few.

http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Add_Partitions