Date: Mon, 12 Mar 2012 23:52:40 +0000 (UTC)
From: "Eli Collins (Updated) (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <277651547.5672.1331596360716.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <529097072.5638.1331595760249.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (HDFS-3081) SshFenceByTcpPort uses netcat incorrectly

[ 
https://issues.apache.org/jira/browse/HDFS-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins updated HDFS-3081:
------------------------------

    Target Version/s: 0.23.3  (was: 0.24.0)

> SshFenceByTcpPort uses netcat incorrectly
> -----------------------------------------
>
> Key: HDFS-3081
> URL: https://issues.apache.org/jira/browse/HDFS-3081
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: ha
> Affects Versions: 0.24.0
> Reporter: Philip Zeyliger
> Assignee: Todd Lipcon
>
> SshFenceByTcpPort currently assumes that the NN is listening on localhost. Typical setups have the namenode listening only on the namenode's hostname, so "nc -z localhost" would not detect it.
> Here's an example in which the NN is running and listening on 8020, but doesn't respond to "localhost 8020".
> {noformat}
> [root@xxx ~]# lsof -P -p 5286 | grep -i listen
> java 5286 root 110u IPv4 1772357 TCP xxx:8020 (LISTEN)
> java 5286 root 121u IPv4 1772397 TCP xxx:50070 (LISTEN)
> [root@xxx ~]# nc -z localhost 8020
> [root@xxx ~]# nc -z xxx 8020
> Connection to xxx 8020 port [tcp/intu-ec-svcdisc] succeeded!
> {noformat}
> Here's the likely offending code:
> {code}
> LOG.info(
> "Indeterminate response from trying to kill service. " +
> "Verifying whether it is running using nc...");
> rc = execCommand(session, "nc -z localhost 8020");
> {code}
> Naively, we could run netcat against the correct hostname (since the NN ought to be listening on the hostname it's configured as), or just use fuser.
> Fuser catches ports independently of what IPs they're bound to:
> {noformat}
> [root@xxx ~]# fuser 1234/tcp
> 1234/tcp: 6766 6768
> [root@xxx ~]# jobs
> [1]- Running nc -l localhost 1234 &
> [2]+ Running nc -l rhel56-18.ent.cloudera.com 1234 &
> [root@xxx ~]# sudo lsof -P | grep -i LISTEN | grep -i 1234
> nc 6766 root 3u IPv4 2563626 TCP localhost:1234 (LISTEN)
> nc 6768 root 3u IPv4 2563671 TCP xxx:1234 (LISTEN)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
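
The two alternatives proposed in the issue description can be sketched as follows. This is only an illustration, not the actual Hadoop fix: the class FenceCheckCommands and its methods are hypothetical names, and the hostname check is an added assumption to keep the value safe to pass to a remote shell. It only builds the command strings ("nc -z <host> <port>" and "fuser <port>/tcp", both taken from the transcripts above) that would replace the hard-coded "nc -z localhost 8020".

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the Hadoop source. */
final class FenceCheckCommands {
    // Conservative hostname pattern (assumption) so the value can be placed
    // in a shell command line without quoting concerns.
    private static final Pattern HOST =
        Pattern.compile("[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?");

    private FenceCheckCommands() {}

    /** Builds a netcat probe against the configured host instead of localhost. */
    static String netcatCheck(String host, int port) {
        if (host == null || !HOST.matcher(host).matches()) {
            throw new IllegalArgumentException("unsafe or empty host: " + host);
        }
        checkPort(port);
        return "nc -z " + host + " " + port;
    }

    /** Builds a fuser probe, which does not depend on the bind address. */
    static String fuserCheck(int port) {
        checkPort(port);
        return "fuser " + port + "/tcp";
    }

    private static void checkPort(int port) {
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
    }

    public static void main(String[] args) {
        // Example host name is made up; a real caller would use the
        // address the NN is configured to listen on.
        System.out.println(netcatCheck("nn1.example.com", 8020));
        System.out.println(fuserCheck(8020));
    }
}
```

A caller would pass the resulting string to the existing execCommand(session, ...) call shown in the quoted code and interpret the return code as before.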