hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1159) Reducers hang when map output file has a checksum error
Date Mon, 26 Mar 2007 18:16:32 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12484176 ]

Devaraj Das commented on HADOOP-1159:
-------------------------------------

Although this particular issue can be fixed by handling the NPE, this and similar problems
can be handled more generally by a fix for HADOOP-1158.
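
To illustrate the narrower fix (handling the NPE), here is a minimal, self-contained sketch. It is not the TaskTracker code: the class, the exception type and the Result enum are hypothetical, and the null check assumes the NPE comes from a missing stream after the bad file was moved away, which the log does not confirm. The point is only that every failure while serving a map output is turned into an explicit error for the fetching reduce.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class MapOutputServeSketch {

    /** Hypothetical stand-in for a checksum failure; not Hadoop's ChecksumException. */
    static class ChecksumMismatchException extends IOException {
        ChecksumMismatchException(String msg) { super(msg); }
    }

    /** Outcome reported back to the fetching reduce (hypothetical). */
    enum Result { OK, FAILED }

    /**
     * Copies the map output to the response. Any failure, including an
     * unexpected RuntimeException such as a NullPointerException, becomes
     * an explicit FAILED result instead of escaping the handler.
     */
    static Result serve(InputStream mapOutput, OutputStream response) {
        if (mapOutput == null) {
            // Assumption: the stream may be missing once the bad file was moved.
            return Result.FAILED;
        }
        byte[] buf = new byte[4096];
        try {
            int n;
            while ((n = mapOutput.read(buf)) != -1) {
                response.write(buf, 0, n);
            }
            return Result.OK;
        } catch (IOException | RuntimeException e) {
            return Result.FAILED;
        }
    }

    /** A stream whose every read fails, simulating a corrupt map output. */
    static InputStream corruptStream() {
        return new InputStream() {
            @Override
            public int read() throws IOException {
                throw new ChecksumMismatchException("Checksum error (simulated)");
            }
        };
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        System.out.println(serve(new ByteArrayInputStream(data), new ByteArrayOutputStream()));
        System.out.println(serve(corruptStream(), new ByteArrayOutputStream()));
        System.out.println(serve(null, new ByteArrayOutputStream()));
    }
}
```

A guard like this only stops the exception on the serving side. The reduces would still need a way to give up on the lost map output, which is the more general problem referred to above.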

> Reducers hang when map output file has a checksum error
> -------------------------------------------------------
>
>                 Key: HADOOP-1159
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1159
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.12.2
>            Reporter: Nigel Daley
>
> Two reduces hung in our sort benchmark. They always fail to get map outputs from node X
> because of a checksum error when the map outputs are read on that node, which results in
> a NullPointerException on node X. This leads to constant failures on the two fetching reduces.
> 2007-03-26 00:02:57,082 WARN org.apache.hadoop.fs.FileSystem: Moving bad file /e/c/k/hqa/tb/tmp/mapred/local2/task_0002_m_022488_0/file.out to /e/c/bad_files/file.out.542279301
> 2007-03-26 00:02:57,083 INFO org.apache.hadoop.fs.FSInputChecker: Found checksum error: org.apache.hadoop.fs.ChecksumException: Checksum error: /e/c/k/hqa/tb/tmp/mapred/local2/task_0002_m_022488_0/file.out at 106484224
> 	at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.verifySum(ChecksumFileSystem.java:254)
> 	at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.readBuffer(ChecksumFileSystem.java:211)
> 	at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.read(ChecksumFileSystem.java:167)
> 	at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:41)
> 	at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> 	at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
> 	at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
> 	at java.io.DataInputStream.read(DataInputStream.java:132)
> 	at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:1659)
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
> 	at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
> 	at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
> 	at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
> 	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
> 	at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
> 	at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
> 	at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
> 	at org.mortbay.http.HttpServer.service(HttpServer.java:954)
> 	at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
> 	at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
> 	at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
> 	at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
> 	at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
> 	at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
> 2007-03-26 00:02:57,083 WARN /: /mapOutput?map=task_0002_m_022488_0&reduce=1542:
> java.lang.NullPointerException

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

