From: "Jeffrey Naisbitt (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Date: Fri, 13 May 2011 16:34:47 +0000 (UTC)
Message-ID: <2431619.10526.1305304487811.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1041089887.7868.1305228827353.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (MAPREDUCE-2489) Jobsplits with random hostnames can make the queue unusable

[
https://issues.apache.org/jira/browse/MAPREDUCE-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033116#comment-13033116 ]

Jeffrey Naisbitt commented on MAPREDUCE-2489:
---------------------------------------------

Honestly, I'm not sure what caching was enabled at the time. How would caching have helped in this case though - where we have basically tons of lookups on garbage hostnames? (none of these strings are repeated)

> Jobsplits with random hostnames can make the queue unusable
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-2489
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2489
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>            Reporter: Jeffrey Naisbitt
>            Assignee: Jeffrey Naisbitt
>
> We saw an issue where a custom InputSplit was returning invalid hostnames for the splits that were then causing the JobTracker to attempt to excessively resolve host names. This caused a major slowdown for the JobTracker. We should prevent invalid InputSplit hostnames from affecting everyone else.
> I propose we implement some verification for the hostnames to try to ensure that we only do DNS lookups on valid hostnames (and fail otherwise). We could also fail the job after a certain number of failures in the resolve.
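
To make the proposal in the description a bit more concrete, here is a rough sketch of what a syntactic pre-check plus a failure cap might look like. This is not Hadoop code and not a patch for this issue: the class name SplitHostValidator, the accept method, and the maxInvalid threshold are all made up for illustration, and the label rule is just the usual letters/digits/hyphens hostname syntax.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not part of Hadoop): rejects split hostnames that are
 * not even syntactically plausible, so no DNS lookup is attempted for them,
 * and gives up once too many bad names have been seen.
 */
final class SplitHostValidator {
    // One DNS label: 1-63 letters, digits or hyphens, no leading/trailing hyphen.
    private static final Pattern LABEL =
        Pattern.compile("[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?");

    private final int maxInvalid; // assumed per-job threshold
    private int invalidCount = 0;

    SplitHostValidator(int maxInvalid) {
        this.maxInvalid = maxInvalid;
    }

    /** Purely syntactic check; performs no DNS lookup. */
    static boolean isPlausibleHostname(String host) {
        if (host == null || host.isEmpty() || host.length() > 253) {
            return false;
        }
        // The -1 limit keeps empty labels, so "a..b" and "a." are rejected.
        for (String label : host.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }

    /**
     * Returns true if the caller may go on to resolve the host, false if the
     * host should be skipped. Throws once more than maxInvalid bad names have
     * been seen, which is where a real implementation might fail the job.
     */
    boolean accept(String host) {
        if (isPlausibleHostname(host)) {
            return true;
        }
        invalidCount++;
        if (invalidCount > maxInvalid) {
            throw new IllegalStateException(
                "Too many invalid split hostnames: " + invalidCount);
        }
        return false;
    }
}
```

Note that a check like this only filters out names that cannot be hostnames at all. Random strings that happen to be well-formed would still pass it and reach the resolver, so the second half of the proposal (failing the job after a certain number of failed resolves) would still be needed for that case.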