Date: Wed, 15 May 2013 10:28:01 -0400
Subject: Re: [ACS41] System VMs not syncing time - does this block the release?
From: John Burwell
To: "dev@cloudstack.apache.org"

Chip,

The issues with clock drift on the system VMs go farther and deeper than S3-backed Secondary Storage. Essentially, anything the system VMs do that involves time cannot be trusted. For example, the timestamps of files written by the SSVM are unreliable. Bear in mind that it is possible for a system VM to have a slow clock. Therefore, in a worst-case scenario, the timestamp of the file would be in the past, breaking any logic that scans for updated files. Additionally, correlating logs on a system VM with the management server or other parts of the infrastructure is difficult to impossible in these types of clock drift scenarios. In summary, when time in a distributed system gets skewed, a raft of subtle but significant operational issues emerges.

It is also important to note that fixing this issue is larger than simply running NTP on the system VMs. As I noted on the ticket, each hypervisor has a recommended approach for ensuring clock synchronization (e.g. VMware and KVM provide daemons and/or kernel drivers to sync clocks properly).
The proper fix for the issue will be to implement those best practices in each hypervisor-specific system VM ISO. I think the biggest challenge in implementing the fix will be testing rather than development.

Thanks,
-John

On Wed, May 15, 2013 at 10:18 AM, Chip Childers wrote:
> Starting a thread on this specific issue.
>
> CLOUDSTACK-2492 was opened, which is basically the fact that the System
> VMs aren't syncing time to the host or to an NTP server. The S3
> integration is broken because of this problem, and therefore could not
> be considered a function available in 4.1 if we release as is.
>
> We need input from people that know about the current system VMs (the
> 3.x VMs), as well as the possibility of using the newer ones that we
> have been considering experimental for 4.1.0.
>
> What should we do?
>
> -chip
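
To illustrate the slow-clock case described in the reply above (a file stamped in the past breaking logic that scans for updated files), here is a minimal Python sketch. It is not CloudStack code: the `FileRecord` type, the `scan_for_updates` function, the file names, and the 60-second offset are all hypothetical, and it assumes a scanner that compares each file's timestamp against the time of its previous scan.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    name: str
    mtime: float  # timestamp assigned by the writer's own clock, in seconds


def writer_timestamp(true_time: float, clock_offset: float) -> float:
    """Timestamp a writer would record, given how far its clock is off."""
    return true_time + clock_offset


def scan_for_updates(files, last_scan_time: float):
    """Return names of files whose timestamp is newer than the last scan."""
    return [f.name for f in files if f.mtime > last_scan_time]


# The scanner (on an accurate clock) last ran at t=1000.
last_scan = 1000.0

# Both files are really written at t=1010, after the last scan.
# Hypothetical names; the second writer's clock runs 60 seconds slow.
accurate = FileRecord("template-a.vhd", writer_timestamp(1010.0, 0.0))
slow = FileRecord("template-b.vhd", writer_timestamp(1010.0, -60.0))

found = scan_for_updates([accurate, slow], last_scan)
print(found)  # the file stamped by the slow clock is missing from the result
```

Both files were written after the last scan, but the one stamped by the slow clock carries a timestamp earlier than the scan time, so a timestamp comparison never reports it as updated.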