Message-ID: <1674845.1168389507541.JavaMail.jira@brutus>
Date: Tue, 9 Jan 2007 16:38:27 -0800 (PST)
From: "Raghu Angadi (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-758) FileNotFound on DFS block file
In-Reply-To: <2482003.1164778941128.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/HADOOP-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12463436 ]

Raghu Angadi commented on HADOOP-758:
-------------------------------------

The exception reported in the bug is the last exception
that occurred. It masks the first exception, which would be a better indicator of the problem.

ReduceTask.java (around line 313) looks like:

    try { /* run reducer */ } finally { /* close some streams */ }

The above trace and the one in HADOOP-757 are both thrown from the finally {} block and mask the exception thrown in try {}. I will submit a patch that prints the exception thrown in try {} if the finally block also throws one.

While trying to reproduce the above trace, I managed to produce the "Bad File Descriptor" exception from HADOOP-757.

In summary, it looks like these failures are possible when tmp space is low, but we don't log the exceptions that were triggered initially.

> FileNotFound on DFS block file
> ------------------------------
>
>                 Key: HADOOP-758
>                 URL: https://issues.apache.org/jira/browse/HADOOP-758
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Owen O'Malley
>         Assigned To: Raghu Angadi
>
> While running the sort benchmark, a reduce failed with:
> java.io.FileNotFoundException: /tmp/hadoop-oom/dfs/tmp/tmp/client-4362164194084664090 (No such file or directory)
> 	at java.io.FileInputStream.open(Native Method)
> 	at java.io.FileInputStream.<init>(FileInputStream.java:106)
> 	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.endBlock(DFSClient.java:1156)
> 	at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:1244)
> 	at java.io.FilterOutputStream.close(FilterOutputStream.java:143)
> 	at java.io.FilterOutputStream.close(FilterOutputStream.java:143)
> 	at java.io.FilterOutputStream.close(FilterOutputStream.java:143)
> 	at org.apache.hadoop.fs.FSDataOutputStream$Summer.close(FSDataOutputStream.java:98)
> 	at java.io.FilterOutputStream.close(FilterOutputStream.java:143)
> 	at java.io.FilterOutputStream.close(FilterOutputStream.java:143)
> 	at java.io.FilterOutputStream.close(FilterOutputStream.java:143)
> 	at org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:515)
> 	at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:71)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:310)
> 	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1271)
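
The masking described in the comment can be reproduced in isolation. The following is only a minimal sketch, not the actual ReduceTask code or the submitted patch: the class `MaskedExceptionDemo`, the methods `runReducer` and `closeStreams`, and the error messages are all hypothetical stand-ins. It shows that an exception thrown in finally {} replaces the one thrown in try {}, and one possible way to record the first exception before it is lost.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class MaskedExceptionDemo {

    // Hypothetical stand-in for running the reducer: the first, root-cause failure.
    static void runReducer() throws IOException {
        throw new IOException("first failure: no space left in tmp");
    }

    // Hypothetical stand-in for closing the output streams: a follow-on failure.
    static void closeStreams() throws IOException {
        throw new IOException("second failure: FileNotFound on close");
    }

    // The pattern from the comment: the exception from finally {} replaces
    // the one from try {}, so the caller only ever sees the second failure.
    static void runMasking() throws IOException {
        try {
            runReducer();
        } finally {
            closeStreams();
        }
    }

    // One possible fix (an assumption, not the real patch): remember the first
    // exception and record it if the cleanup in finally {} throws as well.
    static void runLogging(List<String> log) throws IOException {
        IOException first = null;
        try {
            runReducer();
        } catch (IOException e) {
            first = e;
            throw e;
        } finally {
            try {
                closeStreams();
            } catch (IOException closeError) {
                if (first != null) {
                    log.add("masked by close failure: " + first.getMessage());
                }
                throw closeError;
            }
        }
    }

    // Returns the message of the exception that reaches the caller, or null if none.
    static String propagated(boolean logFirst, List<String> log) {
        try {
            if (logFirst) {
                runLogging(log);
            } else {
                runMasking();
            }
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        System.out.println("masking variant propagates: " + propagated(false, log));
        System.out.println("logging variant propagates: " + propagated(true, log));
        System.out.println("recorded first exceptions:  " + log);
    }
}
```

In both variants the caller still sees only the close failure. The difference is that the second variant leaves a record of the exception that was triggered first, which is the information missing from the trace in this issue.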