Date: Thu, 24 Feb 2011 11:23:07 -0500
From: Ian Main
To: aeolus-devel@lists.fedorahosted.org
Cc:
 deltacloud-dev@incubator.apache.org
Subject: Stateful Vs Stateless Instances Findings
Message-ID: <20110224162306.GA20185@len.mains.priv>

We had a meeting this morning and managed to hash some things out on
this subject. Here is the result. I am posting this to both aeolus-devel
and deltacloud-dev as it impacts both.

Stateful vs Stateless instances and the cloud
---------------------------------------------

In an effort to support private cloud, we had to sort out what to do
about 'stateful' instances. We're defining a stateful instance as one
where the image for the instance does not get destroyed when the
instance is stopped and it is possible to restart it with the same
state.

Also, the model that Aeolus Conductor is using regarding images is the
same as ec2, where a given image can be used to launch as many instances
as needed. This is not the same as the model used by many private cloud
providers, where an image is used directly by an instance/VM and cannot
be used by multiple running instances.

We want to tackle these questions in two different phases. Stateless
cloud on supported providers is presently the most important
requirement, but we will need both soon.

STATELESS CLOUD
---------------

The way that some providers (such as rhevm and vmware) launch instances
now is not what we need. Cloud launch must start an instance in such a
way that multiple launches can be performed from the same image. To
support that, the image needs to be either cloned or snapshotted prior
to launch. Below are a couple of ways we thought of to do this:

- The Deltacloud API provider driver's start call must clone the disk on
  startup if the provider does not already do that.
- On destroy, the API must clean up the cloned disk image if the
  provider does not already do it.
- On shutdown, the API leaves the disk alone.

OR

- Deltacloud API provides introspection into the driver and specifies
  whether or not an instance start will clone the disk for us.
- It also needs to specify whether the storage needs to be destroyed on
  instance shutdown (some clouds do, some don't).
- Deltacloud API provides APIs for performing the clone and then the
  cleanup of cloned images.
- It is then up to the client to perform the right sequence of actions
  to get the desired behavior.

Either of these will work and allow us to make the Conductor model
consistent with existing public cloud providers.

STATEFUL CLOUD INSTANCES
------------------------

Stateful instances have the ability to be stopped without destroying the
disk image and then resumed from the same state. In order to support
this we need:

Condor:

- Condor must have "suspend"/stop/restart support.
- Image must be built to target stateful - we need a toggle at image
  build time and at launch time.

Deltacloud API:

- Deltacloud API must support a stop that does not clean up the disk
  image, but only where the provider makes that available.
- Introspection as to whether or not a stop is valid. Currently stop and
  destroy are the same op on ec2, so 'stop' should be reported as
  invalid there. Stop is only valid for stateful instances. Destroy
  cleans up the disk image. This is to act as introspection as to
  whether the instance is stateful or not. This may not be needed,
  depending on how we support disk clone/cleanup as defined above.

Conductor:

- If an instance on a stateful-only cloud is designated stateless, we
  must track that extra condition in the conductor (because otherwise
  stateful and stateless instances will act the same). This lets us know
  if we need to clean up disk images etc.
- We could provide a 'stateful/stateless' toggle at launch time that
  warns the user when there is no match and they must build a stateful
  instance for it to work.

Thanks!

    Ian
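
For illustration only, the second option under STATELESS CLOUD (the
driver reports what it does, and the client performs the right
sequence) could be sketched roughly as below. This is not Deltacloud
code: DriverCaps, FakeDriver, launch_stateless and destroy_stateless
are hypothetical names invented for this sketch, and the in-memory
driver only records the calls it receives.

```python
from dataclasses import dataclass, field


@dataclass
class DriverCaps:
    """Hypothetical introspection result for one provider driver."""
    start_clones_disk: bool          # provider clones the image on start
    shutdown_destroys_storage: bool  # provider discards the disk on shutdown


@dataclass
class FakeDriver:
    """In-memory stand-in for a driver; it only records calls."""
    caps: DriverCaps
    calls: list = field(default_factory=list)

    def clone_image(self, image_id):
        self.calls.append(("clone", image_id))
        return image_id + "-clone"

    def start(self, disk_id):
        self.calls.append(("start", disk_id))
        return "inst-" + disk_id

    def destroy(self, instance_id):
        self.calls.append(("destroy", instance_id))

    def delete_disk(self, disk_id):
        self.calls.append(("delete_disk", disk_id))


def launch_stateless(driver, image_id):
    """Start an instance without running directly off the shared image.

    Returns (instance_id, clone_id); clone_id is None when the provider
    clones on start and the client has nothing of its own to track.
    """
    if driver.caps.start_clones_disk:
        return driver.start(image_id), None
    clone_id = driver.clone_image(image_id)
    return driver.start(clone_id), clone_id


def destroy_stateless(driver, instance_id, clone_id):
    """Destroy the instance, then remove any clone the client created."""
    driver.destroy(instance_id)
    if clone_id is not None and not driver.caps.shutdown_destroys_storage:
        driver.delete_disk(clone_id)
```

The sketch only cleans up clones the client made itself. A provider
that clones on start but leaves the storage behind would need a further
call that is not modelled here.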