Message-ID: <20129876.1172561225754.JavaMail.jira@brutus>
Date: Mon, 26 Feb 2007 23:27:05 -0800 (PST)
From: "Devaraj Das (JIRA)"
To: hadoop-dev@lucene.apache.org
Reply-To: hadoop-dev@lucene.apache.org
Subject: [jira] Updated: (HADOOP-1042) Improve the handling of failed map output fetches
In-Reply-To: <18839909.1172554506242.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/HADOOP-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-1042:
--------------------------------

    Attachment: 1042.patch

This patch does the following
(everything in the file ReduceTaskRunner.java):

1) Changes the data structure of knownOutputs from List to Map. This eases replacing MapOutputLocation objects for the failed fetches (if the JobTracker later gives us new locations for those mapIds).
2) Changes ListIterator to Iterator (since it is not straightforward to get a ListIterator out of a Map, and we don't use the features of a ListIterator anyway).
3) Changes the order in which entries (mapId/MapOutputLocation objects) are added to the knownOutputs Map: the entries corresponding to failed fetches are added first, and then the new entries (obtained from the JobTracker) are added. This ensures that the new entries overwrite the old (failed) entries that have the same mapId key.
4) Removes the call to Collections.shuffle() and the associated Random object. Since map outputs are no longer fetched in random order, we don't need either.
5) queryJobTracker now returns a List instead of an array of MapOutputLocation.

> Improve the handling of failed map output fetches
> -------------------------------------------------
>
>                 Key: HADOOP-1042
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1042
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.11.2
>            Reporter: Devaraj Das
>         Assigned To: Devaraj Das
>         Attachments: 1042.patch
>
>
> Currently, whenever the fetch of a map output fails, the corresponding MapOutputLocation is added to a List data structure so that it can be retried later. But if the failure was due to a lost task, the entry that was added is never deleted. In such cases, unnecessary retries will happen. This situation should be prevented.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
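The overwrite behaviour described in points 1) to 3) of the patch summary can be illustrated with a minimal sketch. This is not the code from 1042.patch: the class KnownOutputsSketch, the simplified Location type (standing in for MapOutputLocation), the method rebuildKnownOutputs, and the choice of LinkedHashMap are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified sketch; not the code in ReduceTaskRunner.java.
final class KnownOutputsSketch {

    // Stand-in for MapOutputLocation: a map id and the host serving its output.
    static final class Location {
        final int mapId;
        final String host;

        Location(int mapId, String host) {
            this.mapId = mapId;
            this.host = host;
        }
    }

    // Failed fetches are inserted first, then the locations newly reported by
    // the JobTracker, so a new location replaces a failed one with the same mapId.
    static Map<Integer, Location> rebuildKnownOutputs(List<Location> failedFetches,
                                                      List<Location> fromJobTracker) {
        Map<Integer, Location> knownOutputs = new LinkedHashMap<>();
        for (Location loc : failedFetches) {
            knownOutputs.put(loc.mapId, loc);
        }
        for (Location loc : fromJobTracker) {
            knownOutputs.put(loc.mapId, loc);
        }
        return knownOutputs;
    }

    public static void main(String[] args) {
        List<Location> failed = new ArrayList<>();
        failed.add(new Location(3, "hostA")); // fetch failed, e.g. the task was lost
        failed.add(new Location(7, "hostB")); // fetch failed, no new location yet

        List<Location> fresh = new ArrayList<>();
        fresh.add(new Location(3, "hostC")); // map 3 was re-run elsewhere
        fresh.add(new Location(9, "hostD")); // newly completed map

        Map<Integer, Location> known = rebuildKnownOutputs(failed, fresh);

        // A plain Iterator over the map's values is enough to walk the
        // entries and remove each one once it has been handled.
        Iterator<Location> it = known.values().iterator();
        while (it.hasNext()) {
            Location loc = it.next();
            System.out.println("fetch map " + loc.mapId + " from " + loc.host);
            it.remove();
        }
    }
}
```

In this sketch the stale entry for map 3 is replaced by the new location, so no retry is attempted against the old host, while map 7 keeps its failed entry and is retried as before.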