hadoop-common-user mailing list archives

From <kira.w...@xiaoi.com>
Subject Re: hive task fails when left semi join
Date Tue, 16 Jul 2013 07:49:01 GMT

I have checked it. The datanode logs show the following:

2013-07-16 00:05:31,294 WARN org.apache.hadoop.mapred.TaskTracker:
getMapOutput(attempt_201307041810_0138_m_000259_0,53) failed :

org.mortbay.jetty.EofException: timeout


This may be caused by a so-called “data skew” problem.


Thanks, Devaraj k.



From: Devaraj k [mailto:devaraj.k@huawei.com]
Sent: 16 July 2013 15:37
To: user@hadoop.apache.org
Subject: RE: hive task fails when left semi join




   In the given image, I see some failed/killed map and reduce task
attempts. Could you check why these are failing? You can investigate further
based on the fail/kill reason.




Devaraj k


From: kira.wang@xiaoi.com [mailto:kira.wang@xiaoi.com] 
Sent: 16 July 2013 12:57
To: user@hadoop.apache.org
Subject: hive task fails when left semi join




I am trying to filter out some records from a table in Hive.

The table has more than 4 billion rows.

I do a left semi join between this table and a small table with about 1k rows.


However, after running for 3 hours, the job ends with a failed status.
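
For illustration only, the semantics of the left semi join described above can be sketched in Python. This is not the original Hive query: the table contents, the `id` column, and the `left_semi_join` helper are all hypothetical. The sketch assumes the small table's keys fit in memory, which seems plausible for about 1k rows.

```python
from typing import Dict, Iterable, Iterator, List


def left_semi_join(big_rows: Iterable[Dict], small_keys: Iterable,
                   key: str = "id") -> Iterator[Dict]:
    """Yield each row of big_rows whose key value appears in small_keys.

    A left row is emitted at most once, no matter how many times its
    key occurs on the right side, and only left-side columns are kept.
    """
    lookup = set(small_keys)  # small side held in memory (assumption)
    for row in big_rows:
        if row[key] in lookup:
            yield row


# Hypothetical sample data standing in for the two tables.
big_table: List[Dict] = [
    {"id": 1, "v": "a"},
    {"id": 2, "v": "b"},
    {"id": 1, "v": "c"},
    {"id": 3, "v": "d"},
]
small_table_keys = [1, 3, 3]  # a duplicate key on the right side

result = list(left_semi_join(big_table, small_table_keys))
```

Because the big table is streamed once and each row is only checked against the in-memory key set, this form of the filter does not need to group the big table's rows by key.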


My questions are as follows:

1.     How can I address this problem and finally solve it?

2.     Are there any other good methods that could filter out records with give


The following picture is a snapshot of the failed job.

