Date: Wed, 3 Aug 2011 14:04:27 +0000 (UTC)
From: "Steve Loughran (JIRA)"
To: common-issues@hadoop.apache.org
Reply-To: common-issues@hadoop.apache.org
Message-ID: <1474217524.4719.1312380267633.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <465191020.292.1310999518737.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HADOOP-7469) add a standard handler for socket connection problems which improves diagnostics

    [ https://issues.apache.org/jira/browse/HADOOP-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078756#comment-13078756 ]

Steve Loughran commented on HADOOP-7469:
----------------------------------------

Except the patch is now out of date with SVN and my local trunk has a tree conflict. I'll fix that, then resubmit.

> add a standard handler for socket connection problems which improves diagnostics
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-7469
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7469
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: util
>    Affects Versions: 0.20.203.0, 0.23.0
>            Reporter: Steve Loughran
>            Priority: Minor
>              Labels: debugging
>         Attachments: HADOOP-7466-connection-handler.patch, HADOOP-7469.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Connection refused, connection timed out, no route to host, etc., are classic IOExceptions that can be raised in many parts of the code. The standard JDK exceptions are useless for debugging as they
> # don't include the destination (host, port) that can be used in diagnosing service dead/blocked problems
> # don't include any source hostname that can be used to handle routing issues
> # assume the reader understands the TCP stack.
> It's obvious from the -user lists that a lot of people hit these problems and don't know how to fix them.
> Sometimes the source has been patched to insert the diagnostics, but it may be convenient to have a single method to translate some of these exceptions:
> {code}
> // Takes IOException rather than SocketException: UnknownHostException is not a SocketException
> IOException processIOException(IOException e, String destHost, int destPort) {
>   String localhost = getLocalHostname();
>   String details = "From " + localhost + " to " + destHost + ":" + destPort;
>   // ConnectException and UnknownHostException only accept a message, so chain the cause with initCause()
>   if (e instanceof ConnectException) {
>     return (IOException) new ConnectException(details
>         + " -- see http://wiki.apache.org/hadoop/ConnectionRefused -- " + e).initCause(e);
>   }
>   if (e instanceof UnknownHostException) {
>     return (IOException) new UnknownHostException(details
>         + " -- see http://wiki.apache.org/hadoop/UnknownHost -- " + e).initCause(e);
>   }
>   // + handlers for other common socket exceptions
>
>   // and a default that returns an unknown class unchanged
>   return e;
> }
> {code}
> Testing: try to connect to an unknown host, a local port that isn't live, etc. It's hard to replicate all failures consistently. It may be simpler just to verify that if you pass in a specific exception, the string is expanded and the class is unchanged.
> This code could then be patched in to places where IO takes place. Note that the HttpComponents and HttpClient libraries already add some destination details on some operation failures, with their own HttpException tree: it's simplest to leave these alone.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
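
For illustration only, here is a minimal self-contained sketch of the handler proposed in the issue, together with the simpler test strategy it suggests (pass in a specific exception, then check that the message is expanded and the class is unchanged). This is not the attached patch. The class name SocketDiagnostics, the body of getLocalHostname(), and the host/port in main() are assumptions made for the example. The parameter is typed as IOException because UnknownHostException does not extend SocketException.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical container class; the issue does not say where the method should live.
final class SocketDiagnostics {

    // Assumed helper: best-effort lookup of the local hostname.
    static String getLocalHostname() {
        try {
            return InetAddress.getLocalHost().getHostName();
        } catch (UnknownHostException e) {
            return "localhost";
        }
    }

    // Returns an exception of the same class with source and destination details
    // added to the message, or the original exception if its class is not handled.
    static IOException processIOException(IOException e, String destHost, int destPort) {
        String details = "From " + getLocalHostname() + " to " + destHost + ":" + destPort;
        IOException wrapped;
        if (e instanceof ConnectException) {
            wrapped = new ConnectException(details
                + " -- see http://wiki.apache.org/hadoop/ConnectionRefused -- " + e);
        } else if (e instanceof UnknownHostException) {
            wrapped = new UnknownHostException(details
                + " -- see http://wiki.apache.org/hadoop/UnknownHost -- " + e);
        } else {
            // Default: unknown classes are returned unchanged.
            return e;
        }
        // These JDK exceptions have no (String, Throwable) constructor,
        // so the original is attached as the cause afterwards.
        wrapped.initCause(e);
        return wrapped;
    }

    public static void main(String[] args) {
        // Hypothetical destination, used only to show the expanded message.
        IOException original = new ConnectException("Connection refused");
        IOException wrapped = processIOException(original, "namenode.example.org", 8020);
        System.out.println(wrapped.getClass().getName());
        System.out.println(wrapped.getMessage());
    }
}
```

Because the check needs no network access, it avoids the problem of replicating real connection failures consistently: a test only has to construct the exception by hand and inspect what comes back.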