hadoop-common-dev mailing list archives

From "Jorgen Johnson (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-1685) Possible getMapOutput() failures on tasktracker when mapred.reduce.tasks is overridden in job
Date Mon, 06 Aug 2007 18:23:59 GMT
Possible getMapOutput() failures on tasktracker when mapred.reduce.tasks is overridden in job
--------------------------------------------------------------------------------------------

                 Key: HADOOP-1685
                 URL: https://issues.apache.org/jira/browse/HADOOP-1685
             Project: Hadoop
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.13.1
         Environment: 6-node cluster, all nodes running Red Hat Enterprise Linux 3.0 Standard Server
(Update 4) on Java 6; 2 nodes are Xen virtual machines
            Reporter: Jorgen Johnson
            Priority: Minor


The following error occurs many times on a job where I have set the number of reduce tasks
to be less than the default defined in my hadoop-site.xml. Based on my (still novice) understanding
of the Hadoop infrastructure, it appears that the JobTracker is not honoring mapred.reduce.tasks
as defined in the job conf and is instead using the default.

Map output lost, rescheduling: getMapOutput(task_0010_m_000002_0,6) failed :
java.io.EOFException
	at java.io.DataInputStream.readFully(DataInputStream.java:180)
	at java.io.DataInputStream.readLong(DataInputStream.java:399)
	at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:1911)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:747)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:860)
	at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
	at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
	at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
	at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
	at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
	at org.mortbay.http.HttpServer.service(HttpServer.java:954)
	at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
	at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
	at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
	at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
	at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
	at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)

i.e.,
hadoop-site.xml defines mapred.reduce.tasks=7;
in my job I define mapred.reduce.tasks=3.
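
For concreteness, the job-side override is along these lines (a minimal sketch against the 0.13
mapred JobConf API; the driver class and job name here are made up, and the mapper, reducer, and
input/output paths are omitted):

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    // Sketch only: ReduceOverrideExample is a hypothetical driver class.
    public class ReduceOverrideExample {
        public static void main(String[] args) throws Exception {
            JobConf job = new JobConf(ReduceOverrideExample.class);
            job.setJobName("reduce-override-example");
            // Equivalent to setting mapred.reduce.tasks=3 in the job conf;
            // expected to override the site-wide default of 7.
            job.setNumReduceTasks(3);
            JobClient.runJob(job);
        }
    }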

I get many errors looking for:
getMapOutput(task_0010_m_000002_0,3)
getMapOutput(task_0010_m_000002_0,4)
getMapOutput(task_0010_m_000002_0,5)
getMapOutput(task_0010_m_000002_0,6)
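
If I'm reading the partitioning right, partition numbers 3-6 can only be produced when seven
reduce partitions are assumed. A quick illustration, assuming the default HashPartitioner (the
key used here is arbitrary):

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.lib.HashPartitioner;

    // Illustration only: shows which partition numbers are possible for a given reduce count.
    public class PartitionRangeCheck {
        public static void main(String[] args) {
            HashPartitioner<Text, Text> partitioner = new HashPartitioner<Text, Text>();
            Text key = new Text("an-arbitrary-key");
            // With mapred.reduce.tasks=3, getPartition() can only return 0, 1, or 2 ...
            System.out.println("3 reducers -> partition " + partitioner.getPartition(key, null, 3));
            // ... so fetch requests for partitions 3-6 imply the default of 7 is still in effect.
            System.out.println("7 reducers -> partition " + partitioner.getPartition(key, null, 7));
        }
    }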

This additional error appears to be a side-effect of the actual problem (it stopped happening
when I changed the job conf to match the default number of reduce tasks):
task_0010_m_000016_0: log4j:ERROR Failed to close the task's log with the exception: java.io.IOException:
Bad file descriptor
task_0010_m_000016_0:   at java.io.FileOutputStream.writeBytes(Native Method)
task_0010_m_000016_0:   at java.io.FileOutputStream.write(FileOutputStream.java:260)
task_0010_m_000016_0:   at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
task_0010_m_000016_0:   at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
task_0010_m_000016_0:   at org.apache.hadoop.mapred.TaskLog$Writer.writeIndexRecord(TaskLog.java:251)
task_0010_m_000016_0:   at org.apache.hadoop.mapred.TaskLog$Writer.close(TaskLog.java:235)
task_0010_m_000016_0:   at org.apache.hadoop.mapred.TaskLogAppender.close(TaskLogAppender.java:67)
task_0010_m_000016_0:   at org.apache.log4j.AppenderSkeleton.finalize(AppenderSkeleton.java:124)
task_0010_m_000016_0:   at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
task_0010_m_000016_0:   at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:83)
task_0010_m_000016_0:   at java.lang.ref.Finalizer.access$100(Finalizer.java:14)
task_0010_m_000016_0:   at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:160)


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

