hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2893) checksum exceptions on trunk
Date Tue, 26 Feb 2008 04:59:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12572357#action_12572357
] 

Raghu Angadi commented on HADOOP-2893:
--------------------------------------

I doubt that these are real disk problems. The same disks are also used for DFS files, so if
disks were failing we should see multiple blocks deleted due to these errors in the NameNode log.
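For reference, a minimal sketch of the kind of per-chunk CRC verification that raises these ChecksumExceptions. This is a simplified illustration, not Hadoop's actual FSInputChecker code: it assumes CRC32 checksums over fixed 512-byte chunks (Hadoop's default `io.bytes.per.checksum`), recomputed on read and compared against stored values, reporting the byte offset of the first mismatching chunk.

```java
import java.util.zip.CRC32;

public class ChunkChecksum {
    // Hypothetical chunk size; mirrors Hadoop's io.bytes.per.checksum default of 512.
    static final int BYTES_PER_CHECKSUM = 512;

    // Compute one CRC32 value per fixed-size chunk of the data.
    static long[] checksums(byte[] data) {
        int n = (data.length + BYTES_PER_CHECKSUM - 1) / BYTES_PER_CHECKSUM;
        long[] sums = new long[n];
        for (int i = 0; i < n; i++) {
            CRC32 crc = new CRC32();
            int off = i * BYTES_PER_CHECKSUM;
            crc.update(data, off, Math.min(BYTES_PER_CHECKSUM, data.length - off));
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // Recompute checksums and compare against the stored ones.
    // Returns the byte offset of the first bad chunk, or -1 if all match.
    static long verify(byte[] data, long[] stored) {
        long[] actual = checksums(data);
        for (int i = 0; i < stored.length; i++) {
            if (actual[i] != stored[i]) {
                return (long) i * BYTES_PER_CHECKSUM;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = new byte[2048];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        long[] sums = checksums(data);
        data[1500] ^= 1; // flip one bit, as a failing disk or bad RAM might
        System.out.println("checksum error at offset " + verify(data, sums));
    }
}
```

A single flipped bit anywhere in a chunk changes that chunk's CRC, which is why the exceptions below report a precise byte offset (e.g. "at 2085376") into the local map-output file.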


> checksum exceptions on trunk
> ----------------------------
>
>                 Key: HADOOP-2893
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2893
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.17.0
>            Reporter: lohit vijayarenu
>
> While running jobs like Sort/WordCount on trunk, I see a few task failures with ChecksumException.
> Re-running the tasks on different nodes succeeds.
> Here is the stack trace:
> {noformat}
> Map output lost, rescheduling: getMapOutput(task_200802251721_0004_m_000237_0,29) failed:
> org.apache.hadoop.fs.ChecksumException: Checksum error: /tmps/4/gs203240-29657-6751459769688273/mapred-tt/mapred-local/task_200802251721_0004_m_000237_0/file.out at 2085376
>   at org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:276)
>   at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:238)
>   at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)
>   at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:157)
>   at java.io.DataInputStream.read(DataInputStream.java:132)
>   at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2299)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
>   at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
>   at org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
>   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
>   at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
>   at org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
>   at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
>   at org.mortbay.http.HttpServer.service(HttpServer.java:954)
>   at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
>   at org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
>   at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
>   at org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
>   at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
>   at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
> {noformat}
> Another stack trace:
> {noformat}
> Caused by: org.apache.hadoop.fs.ChecksumException: Checksum error: /tmps/4/gs203240-29657-6751459769688273/mapred-tt/mapred-local/task_200802251721_0004_r_000110_0/map_367.out at 21884416
>   at org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:276)
>   at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:238)
>   at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:176)
>   at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:193)
>   at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:157)
>   at java.io.DataInputStream.readFully(DataInputStream.java:178)
>   at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:56)
>   at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:90)
>   at org.apache.hadoop.io.SequenceFile$Reader.nextRawKey(SequenceFile.java:1930)
>   at org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawKey(SequenceFile.java:2958)
>   at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.next(SequenceFile.java:2716)
>   at org.apache.hadoop.mapred.ReduceTask$ValuesIterator.getNext(ReduceTask.java:209)
>   at org.apache.hadoop.mapred.ReduceTask$ValuesIterator.next(ReduceTask.java:177)
>   ... 5 more
> {noformat}
> Both failures occur with local files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

