Subject: Re: hive runs slowly
From: Edward Capriolo
To: user@hive.apache.org
Date: Fri, 21 Oct 2011 09:59:55 -0400

On Fri, Oct 21, 2011 at 9:22 AM, john smith <js1987.smith@gmail.com> wrote:

> Hi list,
>
> I am also facing the same problem. My reducers hang at this position, and it
> takes hours to complete a single reduce task. Can any Hive guru help us out
> with this issue?
>
> Thanks,
> jS
>
> 2011/10/21 bangbig <lizhonglianggood@163.com>
>
>> Hi all,
>>
>> Hive runs too slowly when it is doing this (see the log below). What's the
>> problem? Is it because I'm joining two large tables?
>>
>> It runs pretty fast at first. When the job reaches 95%, it begins to slow down.
>>
>> --------------------------------------------------
>>
>> INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1044000000 rows
>> 2011-10-21 16:55:57,427 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1045000000 rows
>> 2011-10-21 16:55:57,545 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1046000000 rows
>> 2011-10-21 16:55:57,686 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1047000000 rows
>> 2011-10-21 16:55:57,806 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1048000000 rows
>> 2011-10-21 16:55:57,926 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1049000000 rows
>> 2011-10-21 16:55:58,045 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1050000000 rows
>> 2011-10-21 16:55:58,164 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1051000000 rows
>> 2011-10-21 16:55:58,284 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1052000000 rows
>> 2011-10-21 16:55:58,405 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1053000000 rows
>> 2011-10-21 16:55:58,525 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1054000000 rows
>> 2011-10-21 16:55:58,644 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1055000000 rows
>> 2011-10-21 16:55:58,764 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1056000000 rows
>> 2011-10-21 16:55:58,883 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1057000000 rows
>> 2011-10-21 16:55:59,003 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1058000000 rows
>> 2011-10-21 16:55:59,122 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1059000000 rows
>> 2011-10-21 16:55:59,242 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1060000000 rows
>> 2011-10-21 16:55:59,361 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1061000000 rows
>> 2011-10-21 16:55:59,482 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1062000000 rows
>> 2011-10-21 16:55:59,601 INFO org.apache.hadoop.hive.ql.exec.JoinOperator: 4 forwarding 1063000000 rows

It is hard to say without seeing the query, the table definitions, and the EXPLAIN output. Please send the query. In the meantime, I have a theory:

This query is not good:

select a,b from a,b where a.id=b.id

It does a Cartesian join, pairing every row of a with every row of b before the where clause filters them.

This query is better:

select a,b from a inner join b on (a.id=b.id)
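
One way to check this theory is to compare the plans of the two forms with EXPLAIN. The sketch below is only illustrative: it assumes tables a and b each have an id column, as in the queries above, and the selected columns are placeholders rather than the original poster's schema.

```sql
-- Hypothetical tables a and b, each with an id column (assumed for illustration).

-- Plan for the form with the join condition in the WHERE clause,
-- written the same way as in the query above.
EXPLAIN
SELECT a.id, b.id
FROM a, b
WHERE a.id = b.id;

-- Plan for the form with the join condition in the ON clause.
EXPLAIN
SELECT a.id, b.id
FROM a INNER JOIN b ON (a.id = b.id);
```

Compare the join stage of the two plans. If the first one shows a join without keys and applies the id comparison in a later filter, it is running as a cross product.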

Consider setting this in your hive-site.xml:

hive.mapred.mode=strict

It can prevent you from running dangerous queries, such as Cartesian joins.
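
Strict mode can also be tried for a single session before editing hive-site.xml. A minimal sketch, again assuming hypothetical tables a and b that each have an id column:

```sql
-- Enable strict mode for the current session only.
SET hive.mapred.mode=strict;

-- A join with no join condition is a Cartesian product.
-- In strict mode Hive is expected to reject it at compile time
-- instead of launching the job.
SELECT a.id, b.id
FROM a JOIN b;

-- Switch back to non-strict mode for the rest of the session.
SET hive.mapred.mode=nonstrict;
```

If the slow query is rejected in the same way under strict mode, that is a strong hint it is running as a Cartesian join.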
