hive-user mailing list archives

From Brad Ruderman
Subject Re: External Partition Table
Date Thu, 31 Oct 2013 22:38:21 GMT
That question can't be answered in general. It all depends on the amount of
data per partition, the queries you are going to execute on it, and the
structure of the data. In general in Hive (depending on your cluster size) you
need to balance the number of files against their size; a smaller number
of files is typically preferred but partitions will help when date
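
To make the two layouts from the question concrete, here is a rough sketch in Python that only builds the folder paths each scheme would produce for one month. The table location /data/events and the column name dt are made up for illustration, and it assumes Hive's default key=value folder naming (an external table can point its partitions at any location). It shows that both schemes end up with one leaf folder per day, so the number of partitions holding data is the same either way.

```python
from datetime import date, timedelta

ROOT = "/data/events"  # hypothetical table location, not from the thread


def daily_dates(start, end):
    """Yield every date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def single_key_path(root, d):
    # Option 1: one partition column (called dt here), e.g. dt=20131031
    return f"{root}/dt={d:%Y%m%d}"


def three_key_path(root, d):
    # Option 2: three partition columns, year / month / day
    return f"{root}/year={d.year}/month={d.month}/day={d.day}"


days = list(daily_dates(date(2013, 10, 1), date(2013, 10, 31)))
single = [single_key_path(ROOT, d) for d in days]
nested = [three_key_path(ROOT, d) for d in days]

print(single[-1])                 # /data/events/dt=20131031
print(nested[-1])                 # /data/events/year=2013/month=10/day=31
print(len(single), len(nested))   # 31 31
```

Since the leaf folder count matches, the practical difference is likely to be in how the queries filter (a range on one column versus equality on year and month) rather than in the folder layout itself.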


On Thu, Oct 31, 2013 at 3:34 PM, Raj Hadoop wrote:

> Hi,
> I am planning for a Hive External Partition Table based on a date.
> Which one of the below yields a better performance or both have the same
> performance?
> 1) Partition based on one folder per day
> LIKE date INT
> 2) Partition based on one folder per year / month / day ( So it has three
> folders)
> LIKE year INT, month INT, day INT
> Thanks,
> Raj
