Date: Tue, 20 Mar 2012 21:23:41 +0000 (UTC)
From: "Hudson (Commented) (JIRA)"
To: common-issues@hadoop.apache.org
Reply-To: common-issues@hadoop.apache.org
Message-ID: <765421580.38552.1332278621763.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HADOOP-8191) SshFenceByTcpPort uses netcat incorrectly

[
https://issues.apache.org/jira/browse/HADOOP-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233788#comment-13233788 ]

Hudson commented on HADOOP-8191:
--------------------------------

Integrated in Hadoop-Common-trunk-Commit #1907 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1907/])
    HADOOP-8191. SshFenceByTcpPort uses netcat incorrectly. Contributed by Todd Lipcon. (Revision 1303148)

     Result = SUCCESS
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1303148
Files :
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/SshFenceByTcpPort.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/TestSshFenceByTcpPort.java


> SshFenceByTcpPort uses netcat incorrectly
> -----------------------------------------
>
> Key: HADOOP-8191
> URL: https://issues.apache.org/jira/browse/HADOOP-8191
> Project: Hadoop Common
> Issue Type: Bug
> Components: ha
> Affects Versions: 0.23.3
> Reporter: Philip Zeyliger
> Assignee: Todd Lipcon
> Fix For: 0.24.0, 0.23.3
>
> Attachments: hdfs-3081.txt
>
>
> SshFenceByTcpPort currently assumes that the NN is listening on localhost. Typical setups have the namenode listening only on the namenode's own hostname, so "nc -z localhost" does not detect it.
> Here's an example in which the NN is running, listening on 8020, but doesn't respond to "localhost 8020".
> {noformat}
> [root@xxx ~]# lsof -P -p 5286 | grep -i listen
> java 5286 root 110u IPv4 1772357 TCP xxx:8020 (LISTEN)
> java 5286 root 121u IPv4 1772397 TCP xxx:50070 (LISTEN)
> [root@xxx ~]# nc -z localhost 8020
> [root@xxx ~]# nc -z xxx 8020
> Connection to xxx 8020 port [tcp/intu-ec-svcdisc] succeeded!
> {noformat}
> Here's the likely offending code:
> {code}
> LOG.info(
>     "Indeterminate response from trying to kill service. " +
>     "Verifying whether it is running using nc...");
> rc = execCommand(session, "nc -z localhost 8020");
> {code}
> Naively, we could point netcat at the correct hostname (since the NN ought to be listening on the hostname it's configured as), or just use fuser. fuser catches ports regardless of which IPs they're bound to:
> {noformat}
> [root@xxx ~]# fuser 1234/tcp
> 1234/tcp: 6766 6768
> [root@xxx ~]# jobs
> [1]- Running nc -l localhost 1234 &
> [2]+ Running nc -l rhel56-18.ent.cloudera.com 1234 &
> [root@xxx ~]# sudo lsof -P | grep -i LISTEN | grep -i 1234
> nc 6766 root 3u IPv4 2563626 TCP localhost:1234 (LISTEN)
> nc 6768 root 3u IPv4 2563671 TCP xxx:1234 (LISTEN)
> {noformat}