hive-user mailing list archives

From: Edward Capriolo <edlinuxg...@gmail.com>
Subject: Re: hive runs slowly
Date: Fri, 21 Oct 2011 13:59:55 GMT

On Fri, Oct 21, 2011 at 9:22 AM, john smith <js1987.smith@gmail.com> wrote:

> Hi list,
>
> I am also facing the same problem. My reducers hang at this position and it
> takes hours to complete a single reduce task. Can any Hive guru help us out
> with this issue?
>
> Thanks,
> jS
>
> 2011/10/21 bangbig <lizhonglianggood@163.com>
>
>> Hi all,
>>
>> Hive runs too slowly when it is doing this (see the log below). What's the
>> problem? Is it because I'm joining two large tables?
>>
>> It runs pretty fast at first. When the job reaches 95%, it begins to slow down.
>>
>> --------------------------------------------------
>>
>> INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1044000000 rows
>> 2011-10-21 16:55:57,427 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1045000000 rows
>> 2011-10-21 16:55:57,545 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1046000000 rows
>> 2011-10-21 16:55:57,686 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1047000000 rows
>> 2011-10-21 16:55:57,806 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1048000000 rows
>> 2011-10-21 16:55:57,926 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1049000000 rows
>> 2011-10-21 16:55:58,045 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1050000000 rows
>> 2011-10-21 16:55:58,164 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1051000000 rows
>> 2011-10-21 16:55:58,284 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1052000000 rows
>> 2011-10-21 16:55:58,405 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1053000000 rows
>> 2011-10-21 16:55:58,525 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1054000000 rows
>> 2011-10-21 16:55:58,644 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1055000000 rows
>> 2011-10-21 16:55:58,764 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1056000000 rows
>> 2011-10-21 16:55:58,883 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1057000000 rows
>> 2011-10-21 16:55:59,003 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1058000000 rows
>> 2011-10-21 16:55:59,122 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1059000000 rows
>> 2011-10-21 16:55:59,242 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1060000000 rows
>> 2011-10-21 16:55:59,361 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1061000000 rows
>> 2011-10-21 16:55:59,482 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1062000000 rows
>> 2011-10-21 16:55:59,601 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1063000000 rows
>>
>>
>>
>>
>
It is hard to say without seeing the query, the table definitions, and the
EXPLAIN output. Please send the query. I do have a theory, though:

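This query is not good:
select a,b from a,b where a.id=b.id
It does a Cartesian join: every row of a is paired with every row of b, and
the where clause only filters the pairs afterward.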

This query is better:
select a,b from a inner join b on (a.id=b.id)
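
To make the difference between the two queries concrete, here is a toy model in
Python. It is only a sketch of the general idea, not how Hive actually executes
either query: the function names, the two tiny in-memory tables, and the
"pairs examined" counter are all made up for illustration.

```python
from collections import defaultdict


def cross_then_filter(a_rows, b_rows):
    """Cartesian style: pair every a row with every b row, then filter on id."""
    pairs_examined = 0
    out = []
    for a in a_rows:
        for b in b_rows:
            pairs_examined += 1
            if a["id"] == b["id"]:
                out.append((a, b))
    return out, pairs_examined


def keyed_join(a_rows, b_rows):
    """Join on the key: group b by id so each a row only meets rows sharing its id."""
    by_id = defaultdict(list)
    for b in b_rows:
        by_id[b["id"]].append(b)
    pairs_examined = 0
    out = []
    for a in a_rows:
        for b in by_id.get(a["id"], []):
            pairs_examined += 1
            out.append((a, b))
    return out, pairs_examined


# Hypothetical tables with 1000 rows each and unique ids.
a_rows = [{"id": i, "a": f"a{i}"} for i in range(1000)]
b_rows = [{"id": i, "b": f"b{i}"} for i in range(1000)]

cross_out, cross_pairs = cross_then_filter(a_rows, b_rows)
keyed_out, keyed_pairs = keyed_join(a_rows, b_rows)

# Same result rows, very different amounts of work.
print(cross_pairs, keyed_pairs)  # 1000000 1000
```

Both functions return the same joined rows, but the Cartesian version looks at
every possible pair first. With two large tables that intermediate count grows
as the product of the table sizes.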

Consider setting this in your hive-site.xml:

hive.mapred.mode=strict

Strict mode can prevent you from running dangerous queries, such as
Cartesian joins.
