Date: Sat, 10 May 2014 22:16:23 +0000 (UTC)
From: "David Scott (JIRA)"
To: cloudstack-issues@incubator.apache.org
Reply-To: dev@cloudstack.apache.org
Subject: [jira] [Created] (CLOUDSTACK-6621) Intermittent failure when management server connects to hypervisor via ssh

David Scott created CLOUDSTACK-6621:
---------------------------------------

             Summary: Intermittent failure when management server connects to hypervisor via ssh
                 Key: CLOUDSTACK-6621
                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-6621
             Project: CloudStack
          Issue Type: Bug
      Security Level: Public (Anyone can view this level - this is the default.)
          Components: Management Server
    Affects Versions: 4.5.0
         Environment: I'm running a management server locally (from master c/s 6511b96088af75b7e37a5f8b0cce609b006021fb) and attempting to add a CentOS 6.4 host via the libvirt/KVM plugin
            Reporter: David Scott


The management server attempts to verify the presence of kvm by using ssh to talk to the host via sshExecuteCmd:

https://github.com/apache/cloudstack/blob/master/utils/src/com/cloud/utils/ssh/SSHCmdHelper.java#L63

The work is done by sshExecuteCmdOneShotWithExitCode (called in a loop):

https://github.com/apache/cloudstack/blob/master/utils/src/com/cloud/utils/ssh/SSHCmdHelper.java#L94

This function waits until either EXIT_STATUS or EOF is set, and then calls sshSession.getExitStatus. For me this fails with a NullPointerException:

{noformat}
ERROR [c.c.u.s.SSHCmdHelper] (581293855@qtp-1130716142-0:ctx-57482224 ctx-b2286596 ctx-e73d2678) Ssh executed failed
java.lang.NullPointerException
{noformat}

I added some extra logging and I believe that EOF can be set *before* EXIT_STATUS, i.e. before the exit status is ready. I think if we want there to be a readable exit code, we must wait for EXIT_STATUS.

Perhaps my system has unusual timing, but this hits me every time. Note that the ssh command is repeated multiple times (e.g. 3), which could hide the bug for many people.

I've prepared a simple patch which fixes the issue and makes ssh reliable for me. I'll upload it to review board shortly.
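
To illustrate the race described above, here is a minimal, self-contained sketch. It is not the CloudStack code and not the patch mentioned in this report. The Session interface, the EofFirstSession fake, the method names racyExitCode and safeExitCode, and the numeric flag values are all hypothetical stand-ins. The sketch also assumes that getExitStatus returns a nullable Integer, which is one plausible way the NullPointerException could arise (unboxing null), but the report does not confirm the exact cause.

```java
// Sketch only: every name below is hypothetical and stands in for the real SSH library.
final class ExitStatusWait {
    // Condition flags named after those in the report; the values are placeholders.
    static final int TIMEOUT = 1;
    static final int EOF = 2;
    static final int EXIT_STATUS = 4;

    /** Hypothetical stand-in for the SSH session object. */
    interface Session {
        /** Returns the current condition bits, plus TIMEOUT if none of the bits in mask is set. */
        int waitForCondition(int mask, long timeoutMillis);

        /** Assumed to return null until the exit status has arrived. */
        Integer getExitStatus();
    }

    /** Fake session where EOF is set immediately but EXIT_STATUS only after a few polls. */
    static final class EofFirstSession implements Session {
        private int callsUntilExitStatus;
        private final int exitCode;

        EofFirstSession(int callsUntilExitStatus, int exitCode) {
            this.callsUntilExitStatus = callsUntilExitStatus;
            this.exitCode = exitCode;
        }

        @Override
        public int waitForCondition(int mask, long timeoutMillis) {
            int state = EOF; // EOF is already visible on the first poll
            if (callsUntilExitStatus > 0) {
                callsUntilExitStatus--;
            }
            if (callsUntilExitStatus == 0) {
                state |= EXIT_STATUS;
            }
            return (state & mask) != 0 ? state : (state | TIMEOUT);
        }

        @Override
        public Integer getExitStatus() {
            return callsUntilExitStatus == 0 ? exitCode : null;
        }
    }

    /** Racy pattern: returns as soon as EOF or EXIT_STATUS is set, then reads the status. */
    static int racyExitCode(Session session) {
        session.waitForCondition(EOF | EXIT_STATUS, 1000);
        return session.getExitStatus(); // unboxing null throws NullPointerException
    }

    /** Safer pattern: keep waiting until EXIT_STATUS itself is set, up to maxWaits polls. */
    static int safeExitCode(Session session, int maxWaits) {
        for (int i = 0; i < maxWaits; i++) {
            int conditions = session.waitForCondition(EXIT_STATUS, 1000);
            if ((conditions & EXIT_STATUS) != 0) {
                return session.getExitStatus();
            }
        }
        throw new IllegalStateException("exit status was not reported in time");
    }

    public static void main(String[] args) {
        try {
            racyExitCode(new EofFirstSession(3, 0));
            System.out.println("racy: no failure");
        } catch (NullPointerException e) {
            System.out.println("racy: NullPointerException");
        }
        System.out.println("safe: " + safeExitCode(new EofFirstSession(3, 0), 10));
    }
}
```

With the fake session, the racy variant fails on the first attempt because only EOF is set, while the variant that waits specifically for EXIT_STATUS returns the exit code. Bounding the wait (maxWaits here) keeps the caller from blocking forever if the remote side never reports a status.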