hadoop-hdfs-issues mailing list archives

From "Uma Maheswara Rao G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3618) SSH fencing option may incorrectly succeed if nc (netcat) command not present
Date Wed, 13 Mar 2013 05:26:13 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600841#comment-13600841 ]

Uma Maheswara Rao G commented on HDFS-3618:
-------------------------------------------

{code}
 if (!exec.isClosed()) {
+        outPumper.start();
+        errPumper.start();
{code}
Even after this check passes, the command can still be closed, so these threads could hang again.
Should we introduce a timeout for the commands and also check whether each thread is still alive?
(I think we did this in our internal branch as well, right?)
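
To illustrate the timeout idea, here is a minimal sketch of a bounded join followed by an isAlive() check. It is not taken from the attached patch or from any internal branch. The class name PumperJoinSketch and the helper joinWithTimeout are hypothetical, and the "stuck" thread only stands in for a pumper blocked on a stream that never closes.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: wait for a worker thread with a bound instead of
// joining forever, then report whether it is still alive.
class PumperJoinSketch {

    /**
     * Waits at most timeoutMillis for the thread to terminate.
     * Returns true if it finished, false if it is still alive (possibly hung).
     */
    static boolean joinWithTimeout(Thread t, long timeoutMillis)
            throws InterruptedException {
        if (timeoutMillis <= 0) {
            // Thread.join(0) means "wait forever", which defeats the purpose.
            throw new IllegalArgumentException("timeout must be positive");
        }
        t.join(timeoutMillis);
        return !t.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for a pumper blocked on a stream that never closes.
        CountDownLatch never = new CountDownLatch(1);
        Thread stuck = new Thread(() -> {
            try {
                never.await();
            } catch (InterruptedException e) {
                // Interrupted by the caller after the timeout; just exit.
            }
        });
        stuck.setDaemon(true);
        stuck.start();

        if (!joinWithTimeout(stuck, 200)) {
            System.out.println("pumper did not finish in time, interrupting it");
            stuck.interrupt();
        }
    }
}
```

Note that interrupt() only helps if the thread is blocked in an interruptible call. A thread blocked in a plain stream read may need the underlying stream or channel closed instead.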
                
> SSH fencing option may incorrectly succeed if nc (netcat) command not present
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-3618
>                 URL: https://issues.apache.org/jira/browse/HDFS-3618
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: auto-failover
>    Affects Versions: 2.0.0-alpha
>            Reporter: Brahma Reddy Battula
>            Assignee: Vinay
>         Attachments: HDFS-3618.patch, zkfc_threaddump.out, zkfc.txt
>
>
> Started the NNs and ZKFCs on SUSE 11.
> SUSE 11 has netcat installed, and netcat -z works (but nc -z won't work).
> While executing the following command we got "command not found", so rc is non-zero
> and the code assumes the server is down. We end up never actually checking whether the
> service is down.
> {code}
> LOG.info(
>             "Indeterminate response from trying to kill service. " +
>             "Verifying whether it is running using nc...");
>         rc = execCommand(session, "nc -z " + serviceAddr.getHostName() +
>             " " + serviceAddr.getPort());
>         if (rc == 0) {
>           // the service is still listening - we are unable to fence
>           LOG.warn("Unable to fence - it is running but we cannot kill it");
>           return false;
>         } else {
>           LOG.info("Verified that the service is down.");
>           return true;          
>         }
> {code}
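
One possible way to avoid treating "command not found" as "service is down" is to classify the exit code before deciding. The sketch below is only an assumption about how that could look, not the content of HDFS-3618.patch. FenceProbeSketch, ProbeResult, classify and fenceSucceeded are hypothetical names. It relies on the POSIX shell convention that exit status 127 means the command was not found and 126 means it was found but could not be executed, and it assumes rc is the remote shell's exit status.

```java
// Hypothetical sketch: separate "port probe failed" from "probe could not run".
class FenceProbeSketch {

    enum ProbeResult { LISTENING, NOT_LISTENING, INDETERMINATE }

    /** Maps the exit status of the remote "nc -z host port" call to a result. */
    static ProbeResult classify(int rc) {
        if (rc == 0) {
            // nc connected: the service is still listening.
            return ProbeResult.LISTENING;
        }
        if (rc == 126 || rc == 127) {
            // POSIX shell convention: command not executable / not found.
            // The probe never ran, so nothing is known about the service.
            return ProbeResult.INDETERMINATE;
        }
        // Assumption: any other non-zero status means the connection failed.
        return ProbeResult.NOT_LISTENING;
    }

    /** Fencing counts as successful only when the probe really ran and failed. */
    static boolean fenceSucceeded(int rc) {
        return classify(rc) == ProbeResult.NOT_LISTENING;
    }

    public static void main(String[] args) {
        for (int rc : new int[] {0, 1, 127}) {
            System.out.println("rc=" + rc + " -> " + classify(rc)
                + ", fenced=" + fenceSucceeded(rc));
        }
    }
}
```

With this split, a missing nc binary makes the fencing attempt fail instead of incorrectly reporting that the service is down.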

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
