Message-ID: <21662848.43251282948075442.JavaMail.jira@thor>
Date: Fri, 27 Aug 2010 18:27:55 -0400 (EDT)
From: "Joydeep Sen Sarma (JIRA)"
To: mapreduce-dev@hadoop.apache.org
Reply-To: mapreduce-dev@hadoop.apache.org
Subject: [jira] Reopened: (MAPREDUCE-115) Map tasks are receiving FileNotFound Exceptions for spill files on a regular basis and are getting killed
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

     [ https://issues.apache.org/jira/browse/MAPREDUCE-115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma reopened MAPREDUCE-115:
-----------------------------------------

Re-opening. We are seeing this a lot on hadoop-20 (Yahoo distribution):
1. Reducers are not able to fetch map outputs because the map-side TaskTracker cannot locate the map output.
2. Mappers are not able to locate previously spilled data.

Scott has added logging that tells us, for #1:
- that the map output file was actually present/created at the time the map was first reported to be done
- that we have not removed the map output file (from the TT code path that deletes the files) before the reducer fetch request came in

So something is very fishy: it seems that either the files disappear in the interim, or the LocalDirAllocator is not able to find things that are actually present.

> Map tasks are receiving FileNotFound Exceptions for spill files on a regular basis and are getting killed
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-115
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-115
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Jothi Padmanabhan
>
> The following is the log -- Map tasks are unable to locate the spill files when they are doing the final merge (mergeParts).
> java.io.FileNotFoundException: File /xxx/mapred-tt/mapred-local/taskTracker/jobcache/job_200808190959_0001/attempt_200808190959_0001_m_000000_0/output/spill23.out does not exist.
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:420)
>     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:244)
>     at org.apache.hadoop.fs.FileSystem.getContentSummary(FileSystem.java:682)
>     at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.getFileLength(ChecksumFileSystem.java:218)
>     at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.seek(ChecksumFileSystem.java:259)
>     at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:37)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1102)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:769)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:255)
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2208)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.