From: "Hemanth Yamijala (JIRA)"
To: core-dev@hadoop.apache.org
Date: Fri, 13 Mar 2009 06:07:50 -0700 (PDT)
Subject: [jira] Commented: (HADOOP-5487) Few tasks failed while creating the work directory for a job, when job tracker was restarted
Message-ID: <197648071.1236949670473.JavaMail.jira@brutus>
In-Reply-To: <1989475715.1236949550634.JavaMail.jira@brutus>

[
https://issues.apache.org/jira/browse/HADOOP-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681696#action_12681696 ]

Hemanth Yamijala commented on HADOOP-5487:
------------------------------------------

Following was the exception trace on such a task:

java.io.IOException: Mkdirs failed to create /path/to/mapred-local/taskTracker/jobcache/job_200903130908_0051/work
	at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:829)
	at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1743)
	at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:97)
	at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1708)

> Few tasks failed while creating the work directory for a job, when job tracker was restarted
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5487
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5487
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Hemanth Yamijala
>
> A randomwriter job was running when the job tracker restarted. After the jobtracker restarted, some tasktrackers were sent a reinit action. After this, some new tasks of the random writer were scheduled to be run on the same task trackers. These failed in the job localization while creating the work directory. However, the next attempts of the same job ran successfully and the job succeeded. This happened in about 1% of the total number of tasks.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
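
Editorial illustration: the exception in this report carries the message "Mkdirs failed to create", and the report itself does not identify a cause. The sketch below is not the TaskTracker code and is not a confirmed diagnosis of HADOOP-5487. It only shows one general way a directory-creation check in Java can fail spuriously: java.io.File.mkdirs() returns false when the directory already exists, so a caller that treats false as an error can fail if something else created the directory first. The names WorkDirSketch and ensureDirectory, and the job_example/work path, are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class WorkDirSketch {

    /**
     * Creates dir (and any missing parents), tolerating the case where the
     * directory already exists, for example because another thread created
     * it just before this call. Hypothetical helper, not Hadoop code.
     */
    public static void ensureDirectory(File dir) throws IOException {
        // File.mkdirs() returns false both on a real failure and when the
        // directory is already present, so re-check before raising an error.
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("Mkdirs failed to create " + dir);
        }
    }

    public static void main(String[] args) throws IOException {
        File base = Files.createTempDirectory("jobcache").toFile();
        File work = new File(base, "job_example/work"); // hypothetical layout
        ensureDirectory(work);
        ensureDirectory(work); // second call also succeeds: directory exists
        System.out.println(work.isDirectory()); // prints "true"
    }
}
```

A check written as `if (!dir.mkdirs()) throw ...` would raise on the second call above, even though the directory is usable. Whether anything like this applies to the failures described in the issue is an open question here.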