hive-user mailing list archives

From Suresh Krishnappa <suresh.krishna...@gmail.com>
Subject HIVE issues when using large number of partitions
Date Thu, 07 Mar 2013 14:31:19 GMT
Hi All,
I have a Hadoop cluster with data spread across a large number of directories
(> 10,000). To run Hive queries over this data, I created an external
partitioned table and added each directory as a partition using the
'ALTER TABLE ... ADD PARTITION' command.
Is there a better way to create a Hive external table over a large number of
directories?
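
For clarity, here is roughly what I am doing (the table name, columns, and
paths below are simplified placeholders):

    CREATE EXTERNAL TABLE events (id STRING, value INT)
    PARTITIONED BY (dt STRING)
    LOCATION '/data/events';

    -- repeated once per directory, roughly 10,000 times
    ALTER TABLE events ADD PARTITION (dt='2013-03-01')
    LOCATION '/data/events/dt=2013-03-01';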

I am also facing the following issues due to the large number of partitions:
1) The DDL operations of creating the table and adding partitions take a
very long time; it takes about an hour to add around 10,000 partitions.
2) I get a Java 'out of memory' exception when adding more than 50,000
partitions.
3) I sometimes get a Java 'out of memory' exception on SELECT queries when
the table has more than 10,000 partitions.

What is the recommended limit on the number of partitions for a Hive table?
Are there any configuration settings in Hive/Hadoop to support a large
number of partitions?
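
For example, would raising the heap of the JVM started by the hive script,
or the metastore batch size, be the right knobs to tune, or something else
entirely? Something along these lines (the values below are just guesses):

    # in hive-env.sh -- heap size (in MB) for the JVM started by the hive script
    export HADOOP_HEAPSIZE=2048

    -- Hive session setting for batching partition retrieval from the metastore
    SET hive.metastore.batch.retrieve.max=500;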

I am using Hive 0.10.0. I re-ran the tests with PostgreSQL instead of Derby
as the metastore and still faced similar issues.

I would appreciate any input on this.

Thanks
Suresh
