cloudstack-dev mailing list archives

From Daan Hoogland <daan.hoogl...@gmail.com>
Subject Re: irc meeting on rvr4vpc
Date Mon, 16 Jun 2014 18:22:46 GMT
Hi Karl,

We will have a look at this with the team at Schuberg Philis, thanks.

On Mon, Jun 16, 2014 at 5:46 PM, Karl Harris <karl.harris@sungardas.com> wrote:
> Outlined below is an overview of the analysis and in process work for
> adding Virtual Redundant Routing to CloudStack Virtual Private Clouds.
>
>
> Current state:
>
> Public networks allow for redundant VRs (Virtual Routers). The network
> topology is static: it consists of a single guest network and a public
> network. VR redundancy is implemented using a second VR and the KeepAlived
> and Conntrackd software packages, configured to match the static network
> topology. The static network topology is configured using parameters passed
> to the VR Linux image via /proc/cmdline. The /proc/cmdline information is
> parsed by a shell script called cloud-early-config.sh. The parameters
> required for redundant VR setup are used to configure the KeepAlived and
> Conntrackd packages if a boolean named redundant_router is true (1). While
> changes can be made to the guest network using the script guestnw.sh, this
> shell script assumes that the topology of the network will not change.
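A minimal sketch of the /proc/cmdline handling described above, written in Python for illustration rather than in the shell of cloud-early-config.sh. Only the redundant_router flag comes from the description; the helper names and every other key in the sample line are hypothetical placeholders.

```python
import shlex


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into key=value parameters.

    Bare flags without '=' are stored with an empty string as value.
    """
    params = {}
    for token in shlex.split(cmdline):
        key, _, value = token.partition("=")
        params[key] = value
    return params


def is_redundant(params: dict) -> bool:
    """True when the boolean redundant_router is set to 1."""
    return params.get("redundant_router") == "1"


# Sample input. Apart from redundant_router, the keys are made up
# for this sketch and are not taken from the real VR image.
sample = "quiet name=r-4-VM eth0ip=10.1.1.2 redundant_router=1"
params = parse_cmdline(sample)

if is_redundant(params):
    # The real script would configure KeepAlived and Conntrackd here.
    print("redundant router setup requested")
```

On a running VR the input would presumably be read from /proc/cmdline itself instead of a sample string.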
>
> Desired state:
>
> Redundant VRs should be available not only in public networks but also in
> Virtual Private Clouds (VPCs) under CloudStack.
>
> Issues:
>
> VPCs allow for, among other things, networks with a dynamic topology, aka
> tiers.
>
> Current shell scripts that configure redundant VRs in public networks cannot
> Create, Read, Update, or Delete (CRUD) virtual networks and VRs because the
> script(s) and programs used to change the guest network assume a static
> network topology.
>
> Minimal unit testing for CRUD functions.
>
>
> Current Work in Process:
>
> Generate unit tests for Create, Read, Update and Delete on both redundant
> and non-redundant networks, using a single System Virtual Machine image,
> based on the changes to the script files below.
>
> Generate a unit test to verify that VPCVirtualNetworkApplianceManagerImpl.java
> works with the CRUD-enabled scripts.
>
> Add the interface and ability to do network CRUD to guestnw.sh and
> cloud-early-config.sh.
>
> Modify VPCVirtualNetworkApplianceManagerImpl to allow changes to the guest
> network using the new CRUD functionality of guestnw.sh mentioned above.
>
> Modify the UI and other Java code as required for the implementation of VPC
> redundant routers.
>
> Karl
>
>
> On Wed, Jun 11, 2014 at 3:16 PM, Sheng Yang <sheng@yasker.org> wrote:
>
>> One note:
>>
>> In fact the split of MASTER is not a big issue, because it would only
>> happen if the network runs badly enough, which already causes packet loss.
>>
>> The problem is that it should recover from that situation fast enough.
>> Previously, due to the ARP ping from the BACKUP router (which thought it
>> would replace MASTER), the upstream switch would redirect the traffic to the
>> original BACKUP router for a while; then, as soon as the network recovered,
>> MASTER would preempt BACKUP once again. But it may take some time for the
>> upstream switch to become aware that the MAC/Port/IP mapping has changed.
>> We once tried different MACs for MASTER and BACKUP but found it would result
>> in the upstream switch failing to recognize the MASTER again. Now we're
>> still using the same MAC for MASTER and BACKUP, and the upstream switch can
>> handle the situation better.
>>
>> --Sheng
>>
>>
>> On Wed, Jun 11, 2014 at 12:48 AM, Daan Hoogland <
>> DHoogland@schubergphilis.com> wrote:
>>
>> > Hi,
>> >
>> > We had a little meeting on the state of this feature and the way to go. I
>> > have no karma for ASFBot meetings so here is my excerpt from the
>> > transcript:
>> >
>> > Attendance:
>> > K3KH Karl Harris
>> > Yasker Sheng Yang
>> > Spark404 Hugo Trippaers
>> > echaz Eric Chazas
>> > LeoSimons Leo Simons
>> > dahn Daan Hoogland
>> >
>> > Others were present in the room but not active in the meeting.
>> >
>> > Agenda:
>> > - Feasibility experiment plans by Schuberg Philis
>> > - Reusable work by Karl
>> > - Problems Citrix encountered with the regular redundant router
>> >   (and how to avoid them)
>> > - Work division
>> > - (next meeting needed?)
>> >
>> > We tried to follow the agenda but were not very strict about it. I'll
>> > summarize the outcome per agenda bullet:
>> >
>> > Schuberg Philis wants to implement a feasibility redundant router on a
>> > simulated VPC environment using the operational expertise it has in house.
>> > The outcome would then be back-ported to the device, its agent and the
>> > management server.
>> >
>> > The implementation tactic is to create a JSON-like configuration
>> > description and to let the device do its own configuration. The idea is to
>> > have a single device for normal and VPC routers and to let the redundancy
>> > be a mere property of it. This should lead to the ultimate objective, which
>> > is to have a single, relatively simple, maintainable device.
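A rough sketch of what such a JSON-like description could look like, with redundancy as a plain property and the device deriving its own steps from it. The schema (type, redundant, networks), the addresses and the plan_actions helper are all assumptions made for illustration; nothing here was agreed in the meeting.

```python
import json

# Hypothetical configuration description. Field names and CIDRs are
# invented for this sketch and do not reflect an agreed format.
CONFIG_TEXT = """
{
  "type": "vpc",
  "redundant": true,
  "networks": [
    {"name": "public", "cidr": "203.0.113.0/24"},
    {"name": "tier1", "cidr": "10.1.1.0/24"},
    {"name": "tier2", "cidr": "10.1.2.0/24"}
  ]
}
"""


def plan_actions(config: dict) -> list:
    """Derive the steps the device would take from the description alone."""
    actions = [
        f"configure interface for {net['name']} ({net['cidr']})"
        for net in config["networks"]
    ]
    # Redundancy is only a property: it adds steps, it does not change
    # how the networks themselves are configured.
    if config.get("redundant", False):
        actions.append("enable keepalived")
        actions.append("enable conntrackd")
    return actions


config = json.loads(CONFIG_TEXT)
for action in plan_actions(config):
    print(action)
```

Adding or removing a tier then only changes the networks list, and the same code path serves normal and VPC routers.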
>> >
>> > Karl will describe his endeavors in adapting the existing device on list.
>> >
>> > Sheng described the QA problems Citrix had with the existing redundant
>> > capabilities of the VR and assured us that only one real problem persists.
>> > The failover time of 3 seconds occasionally leads to a split brain, in
>> > which two VRs assume the role of master. As the management server in a
>> > busy environment can take up to 30 seconds to detect a failover, this can
>> > lead to an unacceptable outage. One possible solution, to have the
>> > management server serve as negotiator on such occasions, will be hard to
>> > implement due to this latency. Notably, both routers use the same MAC
>> > address on the interface to the load balancer.
>> >
>> > The resources available from Citrix are uncertain. Planning and design
>> > need to be done. It is agreed that we will work in parallel (Schuberg
>> > Philis and Citrix) but keep in close contact. The amount of resources
>> > Sungard has for this was not discussed. Karl will stay involved.
>> >
>> > We agreed to have the next meeting at 20:00 UTC on June 17th.
>> >
>> > Can someone give me Karma to use ASFBot for this one, please?
>> >
>> > \DaanH
>> >
>> >
>>



-- 
Daan
