hive-user mailing list archives

From "Aggarwal, Vaibhav" <vagg...@amazon.com>
Subject RE: CDH3 U1 Hive Job-commit very slow
Date Wed, 10 Aug 2011 17:30:26 GMT
How much time is the query startup taking?

In earlier versions of Hive (before HIVE-2299) the query startup process used an algorithm
that took O(n^2) operations in the number of partitions.
With roughly 10,000 partitions, that means on the order of 100M operations before it would submit the MapReduce job.
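A rough back-of-the-envelope sketch of why that matters; the figures are illustrative, using the ~10,000-partition count reported elsewhere in this thread:

```python
# Illustrative cost comparison for the pre-HIVE-2299 startup path:
# an O(n^2) pass over partition metadata vs. a linear one.
partitions = 10_186  # partition count reported in this thread

quadratic_ops = partitions ** 2  # operations before job submission
linear_ops = partitions          # what a linear-time pass would cost

print(quadratic_ops)                # 103754596 -- roughly the "100M operations" above
print(quadratic_ops // linear_ops)  # 10186 -- each partition touched ~10k times
```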

From: air [mailto:cnweike@gmail.com]
Sent: Wednesday, August 10, 2011 3:40 AM
To: user@hive.apache.org
Subject: Re: CDH3 U1 Hive Job-commit very slow

There are only 10,186 partitions in the metadata store (select count(1) from PARTITIONS; in
MySQL), so I don't think that is the problem.
2011/8/10 Aggarwal, Vaibhav <vaggarw@amazon.com>
Do you have a lot of partitions in your table?
The time taken to process the partitions before submitting the job is proportional to the
number of partitions.
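A quick way to check the partition count for a single table from the Hive CLI itself, without querying the metastore database directly (assuming the table in question is `log_test`, as in the query later in this thread):

```sql
-- Lists every partition of the table; the number of output lines
-- is the partition count for that table.
SHOW PARTITIONS log_test;
```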

There is a patch I submitted recently that attempts to alleviate this problem:

https://issues.apache.org/jira/browse/HIVE-2299

If that is not the case, I would also be interested in the root cause of the large query startup time.

From: air [mailto:cnweike@gmail.com]
Sent: Tuesday, August 09, 2011 1:19 AM
To: user@hive.apache.org
Subject: Fwd: CDH3 U1 Hive Job-commit very slow


---------- Forwarded message ----------
From: air <cnweike@gmail.com>
Date: 2011/8/9
Subject: CDH3 U1 Hive Job-commit very slow
To: CDH Users <cdh-user@cloudera.org>


When I submit a query to Hive, it takes a very long time before the job is actually submitted to the Hadoop
cluster. What might cause this problem? Thank you for your help.

hive> select count(1) from log_test where src='test' and ds='2011-08-04';
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>              <-- stays here for a long time...

--
Knowledge Management.


