incubator-cloudstack-dev mailing list archives

From Kelven Yang <>
Subject Re: [RFC] CloudStack Architectural Refactor
Date Sat, 11 Aug 2012 05:53:08 GMT
CloudStack conceptually manages how virtual resources are virtualized (or,
if you prefer, realized) on the physical infrastructure. This management
work includes:

1) Tenant-based configuration management for compute (VMs), network (user
logical networks), and storage
2) Fabric-level (physical resources such as hypervisor hosts and storage
servers) configuration and resource management
3) Mapping management (the configuration of how virtual resources are
realized on the physical fabric)
4) Resource allocation and orchestration of the resource realization process

Virtualization technology is all about how virtual resources share a common
physical infrastructure in an efficient, secure way. From a networking
perspective, CloudStack maps user logical networks onto the physical
network infrastructure in either a bridged or an isolated way. The major
isolation technologies currently available are VLANs and "L2 over L3"
technologies. Because CloudStack provides not only L2 resource allocation
(for example, allocation of VLANs) but also L3 resource allocation (i.e.,
IP addresses), the underlying isolation technology affects how these
resources are allocated.

NetworkGuru was designed to allow third-party vendors to plug different
isolation technologies into CloudStack. One part of NetworkGuru's job is to
help CloudStack in the resource allocation phase (IP allocation, etc.),
which may be affected by the underlying isolation technology.

I have had quite a few debates internally with Alex about this. The major
point of debate is whether we can model "isolation" itself, so that we can
get rid of NetworkGuru altogether and leave the realization details to the
network components that implement the isolation technology (as Tomoe is
wondering why this is not already the case). The key is to treat these
different isolation technologies as mere realization details of a network
tenant (be it VLAN or any number of "L2 over L3" isolation technologies, it
does not matter). All CloudStack cares about is a tenant ID. Basically, the
CloudStack framework can take over what NetworkGuru does currently:

1) IP allocation based on whether it is tenant based or non-tenant based
2) Tenant key allocation (VLAN ids or other "L2 over L3" tenant keys)
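
To illustrate item 2, here is a minimal sketch in Python (purely
illustrative, not CloudStack code) of a framework-owned tenant key
allocator. The class name TenantKeyPool and its methods are hypothetical.
The only technology-specific input is the key range, for example 1-4094
for VLAN IDs; an "L2 over L3" technology would simply configure a
different range.

```python
class TenantKeyPool:
    """Hands out opaque tenant keys from a configured range (hypothetical helper)."""

    def __init__(self, first, last):
        if first > last:
            raise ValueError("empty key range")
        self._first = first
        self._last = last
        self._in_use = {}  # tenant key -> tenant id

    def allocate(self, tenant_id):
        # Return the existing key if this tenant already holds one,
        # so that repeated calls are idempotent.
        for key, owner in self._in_use.items():
            if owner == tenant_id:
                return key
        # Otherwise hand out the lowest free key in the range.
        for key in range(self._first, self._last + 1):
            if key not in self._in_use:
                self._in_use[key] = tenant_id
                return key
        raise RuntimeError("tenant key range exhausted")

    def release(self, tenant_id):
        # Free every key held by the tenant (at most one in this sketch).
        for key, owner in list(self._in_use.items()):
            if owner == tenant_id:
                del self._in_use[key]


# Assumption: VLAN is the isolation technology, so usable IDs are 1..4094.
vlan_pool = TenantKeyPool(1, 4094)
key_a = vlan_pool.allocate("tenant-a")
key_b = vlan_pool.allocate("tenant-b")
```

The allocator never needs to know what the key means on the wire; that
knowledge stays with whichever component realizes the isolation.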

For #1, a bridged realization requires knowledge of how the physical
infrastructure is configured, so to really make this possible we should
unify how CloudStack views a physical network configuration and a logical
user network configuration. For example, when a user network is bridged
into a physical network, IP addresses should be allocated from the physical
side. To complete the idea, a physical network and a user logical network
have to be described in the same way in CloudStack. I would love to see
CloudStack get rid of obstacle terms like Pod, basic zone, advanced zone,
"physical" network, public IP, private IP, etc. (you name it!), and go back
to talking in terms of subnets (be it a physical subnet in the data center
or a logical subnet in a user's logical network configuration) and IPs on a
subnet (be they private or public). The majority of these obstacle terms
are really just very high-level, ever-changing configuration types; giving
them special names in CloudStack makes it confusing for people to
understand the very basics.
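
As a rough sketch of what "talking in terms of subnets" could look like
(Python, illustrative only; the Subnet class and the bridged_to attribute
are hypothetical names, not existing CloudStack concepts), a data-center
subnet and a user subnet share one description, and a bridged user subnet
draws its addresses from the physical side:

```python
import ipaddress


class Subnet:
    """One description for both data-center subnets and user logical subnets."""

    def __init__(self, name, cidr, bridged_to=None):
        self.name = name
        self.network = ipaddress.ip_network(cidr)
        self.bridged_to = bridged_to  # another Subnet, or None when isolated
        self._allocated = set()

    def allocate_ip(self):
        # A bridged subnet draws its addresses from the subnet it is
        # bridged into, so allocation happens on the physical side.
        if self.bridged_to is not None:
            return self.bridged_to.allocate_ip()
        # An isolated (or physical) subnet allocates from its own range.
        for host in self.network.hosts():
            if host not in self._allocated:
                self._allocated.add(host)
                return host
        raise RuntimeError(f"no free IP left in {self.network}")


dc_subnet = Subnet("dc-subnet", "10.1.0.0/24")
bridged = Subnet("user-bridged", "10.1.0.0/24", bridged_to=dc_subnet)
isolated = Subnet("user-isolated", "192.168.0.0/24")

first_bridged_ip = bridged.allocate_ip()    # taken from dc_subnet
first_isolated_ip = isolated.allocate_ip()  # taken from its own range
```

Nothing in the sketch needs to know about pods, zones, or public versus
private addresses; those become properties of how subnets are configured.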

In summary, only with an abstracted model on top of this can we let a
component focus on what it is supposed to do, namely realizing the
technology itself, instead of participating in every orchestration phase,
many of which it may not care about. Most of the time, the resource
allocation phase can change a lot depending on the mapping configurations
that users add to the system, and our current NetworkGuru has fallen into
exactly this situation.
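
The split argued for above might be sketched as follows (Python,
illustrative only; IsolationPlugin, VlanPlugin, OverlayPlugin and
Orchestrator are hypothetical names, and the addressing scheme is a toy).
The framework owns tenant key and IP allocation, and the vendor plugin is
only asked to realize the result:

```python
from abc import ABC, abstractmethod


class IsolationPlugin(ABC):
    """Vendor-supplied piece: only knows how to realize a tenant network."""

    @abstractmethod
    def realize(self, tenant_key, ip):
        """Apply the isolation technology for one NIC; returns a description."""


class VlanPlugin(IsolationPlugin):
    def realize(self, tenant_key, ip):
        return f"vlan {tenant_key}: attach {ip}"


class OverlayPlugin(IsolationPlugin):
    # Stand-in for any "L2 over L3" technology.
    def realize(self, tenant_key, ip):
        return f"overlay segment {tenant_key}: attach {ip}"


class Orchestrator:
    """Framework side: owns allocation, delegates only realization."""

    def __init__(self, plugin):
        self._plugin = plugin
        self._tenant_keys = {}  # tenant id -> tenant key
        self._ip_counts = {}    # tenant id -> number of IPs handed out

    def deploy_nic(self, tenant_id):
        # Allocation happens here, independent of the isolation technology.
        key = self._tenant_keys.setdefault(tenant_id, len(self._tenant_keys) + 1)
        count = self._ip_counts.get(tenant_id, 0) + 1
        self._ip_counts[tenant_id] = count
        ip = f"10.0.{key}.{count}"  # toy addressing scheme for the sketch
        # The plugin sees only the outcome of allocation.
        return self._plugin.realize(key, ip)


vlan_cloud = Orchestrator(VlanPlugin())
overlay_cloud = Orchestrator(OverlayPlugin())
```

Swapping VlanPlugin for OverlayPlugin changes nothing in Orchestrator,
which is the property the current NetworkGuru design does not have.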

Comments? Flames? We are currently working on re-architecting CloudStack
into a more open, better platform, so please speak out loudly and let your
comments be heard.


On 8/10/12 7:36 AM, "Tomoe Sugihara" <> wrote:

>Sorry for commenting to this thread during the busiest time right
>before the release,
>But I'd like to add my thoughts especially on networking part while my
>memory is fresh since I started
>working on integrating our networking technology from Midokura into CS a
>couple of weeks ago.
>I'm still new, so please correct me if I'm wrong, and I'd be happy to
>hear any comments, suggestions, and
>constructive criticism:)
>My comments below.
>On Thu, Aug 9, 2012 at 1:38 PM, Alex Huang <> wrote:
>> CloudStack 3.x has quite a bit of baggage after years of development.
>>This makes it difficult for new committers to join the project.  Even
>>for existing committers, it is easy to make mistakes.  We're looking to
>>make some fundamental changes to CloudStack to overcome these problems.
>>Please feel free to comment.
>> Problems:
>>   - No clear separation between orchestration and the subsystems being
>>orchestrated.  We've done a lot of work in this area in networking.
>>However, storage and hypervisor  are still missing.
>>   - No clear separation between orchestration and the ACL system.
>>   - No clear separation between orchestration and the extra services
>>(read SSVM, console proxy VM)
>>   - CloudStack believes it is the master of all physical resources when
>>it is not.
>     - No clear separation between orchestration of network and
>underlying networking technology.
>> Goals:
>>   - Form a core orchestration engine that is free of unnecessary
>>artifacts such as ACL.
>>   - Clear definition of functionalities and how to contribute to
>>different parts of CloudStack.
>>   - Allow developers to write and deploy services on the CloudStack
>>orchestration server.
>     - Allow network vendors to plug in their networking technology
>for CS deployment.
>> Design:
>> CloudStack will be refactored into the following functional areas.
>>Details to follow after the design.
>> Orchestration Platform will include the following:
>>   - Orchestration Framework:  Orchestration engine is key in cloud
>>   - Subsystem Plugins: Actual implementations in network, hypervisor,
>>and storage of the orchestration steps.
>>   - Services Framework: A framework for other developers to build
>>additional services that can be launched using the Orchestration Service.
>>   - Events Framework: Allow third-party components to integrate with
>>Orchestration Platform through events only.
>> On top of the Orchestration Platform, CloudStack will come with the
>>following Services which can be enabled/disabled without affecting
>>CloudStack orchestration functionalities.  Each of these services can be
>>deployed on bare metal, VM, or as a webapp.
>>   - ACL Service to provide advanced ACL control to other services
>>   - End User API Service to provide orchestration of end user VMs
>>   - ConsoleProxy Service to provide console access
>>   - Template Service to provide template/ISO transport and access
>>within a zone
>>   - Router Service to provide routing services to end user VMs
>>   - Object Store Service to provide object store functionality
>>   - EC2 API Service
>> Details:
>> Orchestration Platform
>>   - Continues to manage hardware resources
>>   - Provide administrative API/UI
>>   - Minimal ACL provided
>>   - Provides a Services Directory for services to find each other
>>   - Includes the following components
>> Orchestration Framework
>>   - Orchestrates VMs, Volumes, Templates, Networks operations within a
>>data center
>>   - Ensures the operations are idempotent
>>   - Deals with hiccups and maintenance needs of the system
>>administrator so that services built on top do not have to.
>>   - Breaks down cloud operations into steps and pushes the steps to the
>>subsystems to be performed but does not actually perform the steps.
>> Subsystem Plugins:
>>   - Plugins that know how to perform different functionalities that
>>the orchestration framework needs.
>>   - Current Subsystem Planned: Deployment Planning, NetworkGuru,
>>NetworkElement, Storage, Snapshot, Backup (see Edison's RFC)
>  One current problem with NetworkGuru and NetworkElement is that, even
>though they are pluggable
> in the sense that you can specify classes in the config at runtime, the
>orchestration and implementation seem
>tightly coupled.  And the assumption of implementation is pretty much
>VLAN or not.
>Take NetworkGuru: there exists a guru for each type of network (Guest,
>Control, etc.).
> So it looks like they are supposed to be extended if you want to have
>a different networking
>technology. Suppose one wanted to use a different networking isolation
>technology for different network types, (s)he would need to have
>extended classes for every one of the network types.
>This can easily get out of control when vendors try to push their
>technologies for each network type.
>I'd argue that Gurus should be delegating actual implementation of
>networks to a vendor specific plugin.
>This means Gurus will have has-a relationship with vendor specific
>code; I need to think more about how
>exactly that is possible.
>Also, at the hypervisor resource sides, though I only had a look at
>Xen and KVM, networking part is tightly
>coupled in various places like handling start/migration commands. I
>submitted a patch for KVM to partly
>address the issue and started a discussion for xen about a week ago,
>but IMO they should be
>cleanly designed and refactored.
>Hope this makes sense to you and I'll keep looking into the code
>including upper layers than L2.
>Hopefully I can have more concrete suggestions.
>> Services Framework:  (RFC upcoming)
>>   - Leverages the current system VM to allow more services to be built.
>> Events Framework: (RFC upcoming)
>>   - Publishes all Orchestration platform entity changes as events.
>>   - Allow other components outside of Orchestration Platform to
>>interact with Orchestration Platform only through event notifications.
>> Design Thoughts:
>>   - In this design, end user API and end user VMs is one particular
>>service provided.  End User VM services can innovate independently from
>>the Orchestration Platform.
>>   - Domains and Accounts will reside with the ACL service which can
>>integrate with Active Directory and LDAP services.
>> --Alex
