From: Chiradeep Vittal
To: "dev@cloudstack.apache.org"
Date: Tue, 30 Apr 2013 16:52:40 -0700
Subject: Re: Virtual Router: DHCP and 2-second DNS outages
In-Reply-To: <040f01ce45f1$b4550a30$1cff1e90$@gmail.com>

On 4/30/13 3:26 PM, "Dennis Lawler" wrote:

>Every time a new VM is started up, there is a 2 second outage in DNS
>services that can cause problems in guest VMs that use the router VM for
>DNS.
>
>For Cloudstack configurations using both DHCP and DNS services on the
>router VM (both implemented with dnsmasq), there is currently a 2 second
>DNS service outage every time a new VM is instantiated.
>
>The source of this outage is in edithosts.sh, which uses "service dnsmasq
>restart" to pick up the freshly added DNS and DHCP entries.
>
>Restarting the dnsmasq service triggers a sleep for 2 seconds after
>killing dnsmasq before starting it back up again.
>
>An obvious solution would be to replace "service dnsmasq restart" with
>"kill -s 1 $pid" (SIGHUP) so that dnsmasq reads the new DHCP entries
>without restarting, as in dnsmasq_edithosts.sh (external dhcp).
>
>Unfortunately, this solution is flawed because dnsmasq SIGHUP handling
>does not expire in-memory DHCP leases in dnsmasq and all leases are
>infinite by default.

Aha! That's why SIGHUP didn't work consistently. This has been bugging me
for a long time.

>Thus, this will only work if the guest VM performs a DHCP release on
>shutdown, which cannot always be guaranteed.
>
>A few possible solutions off the top of my head:
>
>1. Separate DNS and DHCP services. While DHCP services still
>experience an outage during VM, DNS will not necessarily be impacted if
>implemented correctly.
>
>2. Use SIGHUP with dnsmasq and implement a removeDhcpEntry interface
>for network appliances to force a DHCP release whenever a NIC / IP is
>deallocated.
>This can use dhcp_release to simulate a DHCP release on the router VM.
>Catch: dhcp_release is not available for Debian 6.0. The System VM needs
>to be updated to at least Debian 7.0, or the dnsmasq-tools .deb from 7.0
>would need to be included in the System VM image.

There is going to be a new system vm based on 7.0 for the upcoming
release. This should work with earlier releases as well.
https://cwiki.apache.org/confluence/x/UlHVAQ

>3. Change DHCP to have a shorter lease, track de-allocation of IPs
>separately from VM destruction.
>Catch: This may cause occasional IP pool exhaustion depending on
>allocation of the guest IP range and the rate of VM destruction /
>instantiation in the network.
>
>Thoughts?
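
For what it's worth, option 2 could look roughly like the sketch below on
the router VM. This is only an illustration, not the actual CloudStack
code: the function names (build_release_command, reload_dnsmasq,
remove_dhcp_entry) are made up, the pid file location is passed in because
it varies by install, and removing the entry from the hosts files is left
out. It assumes dhcp_release is on the PATH and takes its arguments as
<interface> <address> <MAC>.

```python
import ipaddress
import os
import re
import signal
import subprocess

# Colon-separated MAC, e.g. 02:00:4c:9f:00:01
MAC_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def build_release_command(interface, ip, mac):
    """Return the argv for dhcp_release: <interface> <address> <MAC>."""
    ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    if not MAC_RE.match(mac):
        raise ValueError("malformed MAC address: %r" % mac)
    return ["dhcp_release", interface, ip, mac]


def reload_dnsmasq(pid_file):
    """Send SIGHUP so dnsmasq re-reads its hosts and DHCP host files.

    Unlike "service dnsmasq restart", the process keeps running, so DNS
    stays up while the new entries are picked up.
    """
    with open(pid_file) as fh:
        pid = int(fh.read().strip())
    os.kill(pid, signal.SIGHUP)


def remove_dhcp_entry(interface, ip, mac, pid_file):
    """Hypothetical removeDhcpEntry handler (name assumed, not from the code).

    Drops the in-memory lease first, since SIGHUP alone does not expire it,
    then asks dnsmasq to reload its files.
    """
    subprocess.check_call(build_release_command(interface, ip, mac))
    reload_dnsmasq(pid_file)
```

The management server would call something like remove_dhcp_entry("eth0",
"10.1.1.25", "02:00:4c:9f:00:01", pid_file) whenever a NIC / IP is
deallocated; the interface, address and MAC here are placeholders.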