cloudstack-issues mailing list archives

From "David Scott (JIRA)" <>
Subject [jira] [Created] (CLOUDSTACK-6621) Intermittent failure when management server connects to hypervisor via ssh
Date Sat, 10 May 2014 22:16:23 GMT
David Scott created CLOUDSTACK-6621:

             Summary: Intermittent failure when management server connects to hypervisor via ssh
                 Key: CLOUDSTACK-6621
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Management Server
    Affects Versions: 4.5.0
         Environment: I'm running a management server locally (from master c/s 6511b96088af75b7e37a5f8b0cce609b006021fb)
and attempting to add a CentOS 6.4 host via the libvirt/KVM plugin
            Reporter: David Scott

The management server attempts to verify the presence of kvm by using ssh to talk to the host
via sshExecuteCmd.

The work is done by sshExecuteCmdOneShotWithExitCode, which is called in a loop.

This function waits until either EXIT_STATUS or EOF is set, and then calls sshSession.getExitStatus.
For me this fails with a NullPointerException:

ERROR [c.c.u.s.SSHCmdHelper] (581293855@qtp-1130716142-0:ctx-57482224 ctx-b2286596 ctx-e73d2678) Ssh executed failed

I added some extra logging and I believe that EOF can be set *before* EXIT_STATUS, i.e. before
the exit status is ready. I think if we want there to be a readable exit code, we must wait
for EXIT_STATUS.
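
To illustrate the ordering problem described above, here is a minimal, self-contained sketch. It does not use the real SSH library and it is not the patch mentioned in this report. FakeSession, its flag constants and both helper methods are hypothetical stand-ins that only model a session where EOF arrives before the exit status, and where the exit status is a nullable Integer until EXIT_STATUS has been raised.

```java
public class ExitStatusWaitSketch {
    // Hypothetical stand-ins for the SSH library's channel condition flags.
    static final int EOF = 1;
    static final int EXIT_STATUS = 2;

    /** Hypothetical fake session: conditions are raised one at a time, in the given order. */
    static final class FakeSession {
        private final int[] arrivalOrder;
        private int next = 0;
        private int conditions = 0;
        private Integer exitStatus = null; // stays null until EXIT_STATUS arrives

        FakeSession(int... arrivalOrder) {
            this.arrivalOrder = arrivalOrder;
        }

        /** Consumes events until any bit in mask is set (or no events remain). */
        int waitForCondition(int mask) {
            while ((conditions & mask) == 0 && next < arrivalOrder.length) {
                int event = arrivalOrder[next++];
                conditions |= event;
                if (event == EXIT_STATUS) {
                    exitStatus = 0; // the remote command exited successfully
                }
            }
            return conditions;
        }

        Integer getExitStatus() {
            return exitStatus;
        }
    }

    /** Models the failing pattern: return as soon as EOF *or* EXIT_STATUS is set. */
    static int waitForEitherThenRead(FakeSession session) {
        session.waitForCondition(EOF | EXIT_STATUS);
        // If only EOF is set, this unboxes null and throws NullPointerException.
        return session.getExitStatus();
    }

    /** Models one possible fix: keep waiting until EXIT_STATUS itself is set. */
    static int waitForExitStatusThenRead(FakeSession session) {
        session.waitForCondition(EXIT_STATUS);
        Integer status = session.getExitStatus();
        if (status == null) {
            throw new IllegalStateException("session ended without an exit status");
        }
        return status;
    }

    public static void main(String[] args) {
        try {
            waitForEitherThenRead(new FakeSession(EOF, EXIT_STATUS));
        } catch (NullPointerException e) {
            System.out.println("EOF or EXIT_STATUS: exit status was not ready");
        }
        int status = waitForExitStatusThenRead(new FakeSession(EOF, EXIT_STATUS));
        System.out.println("EXIT_STATUS only: exit status = " + status);
    }
}
```

In this model the first helper fails whenever EOF is raised first, while the second helper reads the status only after it exists. A real fix would likely also need a timeout and handling for servers that never send an exit status, which this sketch leaves out.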

Perhaps my system has unusual timing, but this hits me every time. Note the ssh command is
repeated multiple times (e.g. 3) which could hide the bug for many people.

I've prepared a simple patch which fixes the issue and makes ssh reliable for me. I'll upload
it to review board shortly.
