hadoop-mapreduce-user mailing list archives

From <kira.w...@xiaoi.com>
Subject Re: hive task fails when left semi join
Date Tue, 16 Jul 2013 08:33:33 GMT
 

Nitin,

 

Thanks for your careful reply.

 

The Hive version currently in use is 0.10.0, and I found the configuration
item you mentioned.



 

I am now using the map join method to filter the data, and it works quite well.
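
As a rough illustration of why the map join helps in this case, here is a minimal Python sketch of a map-side left semi join. It is not Hive's implementation: the row layout and the names `map_side_semi_join`, `big_rows`, and `small_keys` are hypothetical. It only shows the idea that the small table is held in memory as a hash set while the big table is streamed once, so no shuffle or reduce phase is involved.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical row layout for illustration: (join_key, payload).
Row = Tuple[str, str]


def map_side_semi_join(big_rows: Iterable[Row],
                       small_keys: Iterable[str]) -> Iterator[Row]:
    """Yield each big-table row whose key appears in the small table."""
    # The small table (about 1k rows in this thread) fits in memory,
    # so each mapper can keep its keys in a hash set.
    lookup = set(small_keys)
    # The big table is streamed once; matching rows are emitted directly.
    for row in big_rows:
        if row[0] in lookup:
            yield row


big = [("a", "r1"), ("b", "r2"), ("a", "r3"), ("c", "r4")]
small = ["a", "c", "a"]  # duplicate keys must not duplicate output rows
result = list(map_side_semi_join(big, small))
```

Because the lookup is a set, a key that occurs several times in the small table still lets each big-table row through at most once, which matches left semi join semantics.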



 

As for the errors I get when not using the map join method:

 

[one of DNs]

 

2013-07-16 00:05:31,294 WARN org.apache.hadoop.mapred.TaskTracker: getMapOutput(attempt_201307041810_0138_m_000259_0,53) failed :
org.mortbay.jetty.EofException: timeout
         at org.mortbay.jetty.AbstractGenerator$Output.blockForOutput(AbstractGenerator.java:548)
         at org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:572)
         at org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:1012)
         at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:651)
         at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:580)
         at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:3916)
         at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
         at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
         at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
         at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
         at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
         at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
         at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
         at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
         at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
         at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
         at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
         at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
         at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
         at org.mortbay.jetty.Server.handle(Server.java:326)
         at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
         at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
         at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
         at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
         at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
         at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
         at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

 

[NN]

 

2013-07-16 00:07:31,145 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201307041810_0138_r_000053_1: Task attempt_201307041810_0138_r_000053_1 failed to report status for 601 seconds. Killing!
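
For readers unfamiliar with this message, the sketch below models the rule behind it in Python. It is a simplified illustration and not the JobTracker's actual code: a task attempt that has not reported status for longer than the configured timeout is killed. The 600,000 ms value is assumed to be the stock default of `mapred.task.timeout` in Hadoop 1.x, which would be consistent with the 601 seconds in the log; the names `TaskAttempt` and `should_kill` are hypothetical.

```python
from dataclasses import dataclass

# Assumption: stock default of mapred.task.timeout in Hadoop 1.x, in ms.
DEFAULT_TASK_TIMEOUT_MS = 600_000


@dataclass
class TaskAttempt:
    attempt_id: str
    last_report_ms: int  # time of the last status/progress report


def should_kill(attempt: TaskAttempt, now_ms: int,
                timeout_ms: int = DEFAULT_TASK_TIMEOUT_MS) -> bool:
    """Return True once the attempt has been silent longer than the timeout."""
    return now_ms - attempt.last_report_ms > timeout_ms


attempt = TaskAttempt("attempt_201307041810_0138_r_000053_1", last_report_ms=0)
killed_at_601s = should_kill(attempt, now_ms=601_000)
alive_at_599s = should_kill(attempt, now_ms=599_000)
```

Under this model, a reducer that is stuck waiting on map output (as the EofException on the DN side suggests) and reports nothing in the meantime would cross the threshold and be killed.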

 

 

 

From: Nitin Pawar [mailto:nitinpawar432@gmail.com]
Sent: 16 July 2013 15:52
To: user@hadoop.apache.org
Subject: Re: hive task fails when left semi join

 

Dev, 

 

From my past experience running huge single-table queries, one usually hits
reduce-side memory limits or timeout limits. I will wait for Kira to give
more details on this.

Sorry, I forgot to ask for the logs and suggested a different approach instead :(

 

Kira, 

The page is in Chinese, so I can't make much out of it, but the query looks
like a map join.

If you are using an older Hive version, then the query shown on the mail
thread looks good.

 

If you are using a newer Hive version, then

 hive.auto.convert.join=true will do the job.

 

On Tue, Jul 16, 2013 at 1:07 PM, Devaraj k <devaraj.k@huawei.com> wrote:

Hi,

   In the given image, I see some failed/killed map and reduce task
attempts. Could you check why these are failing? You can investigate further
based on the fail/kill reason.

 

 

Thanks

Devaraj k

 

From: kira.wang@xiaoi.com [mailto:kira.wang@xiaoi.com] 
Sent: 16 July 2013 12:57
To: user@hadoop.apache.org
Subject: hive task fails when left semi join

 

Hello,

 

I am trying to filter out some records from a table in Hive.

The table has more than 4 billion rows.

I do a left semi join between this table and a small table with 1k rows.

 

However, after running for 3 hours, the job ends with a failed status.

 

My questions are as follows:

1.     How can I address this problem and finally solve it?

2.     Are there any other good methods to filter out records with given
conditions?

 

The following picture is a snapshot of the failed job.



 





 

-- 
Nitin Pawar

