incubator-cloudstack-dev mailing list archives

From Chiradeep Vittal <>
Subject Re: [RFC] QinQ vlans support
Date Mon, 22 Oct 2012 05:41:25 GMT
+1 on the FS.

On 10/20/12 10:52 PM, "Marcus Sorensen" <> wrote:

>The admin does have to create a new physical network, the patch just
>allows you to use a tagged network as that physical network rather
>than a real eth device. It is true that cloudstack doesn't know about
>q-in-q per se, but it is the one creating the q-in-q vlans. The admin
>does have to create any "vlan#" devs to be used, but I think that
>makes sense since cloudstack doesn't manage any of your physical
>network devices. Perhaps I need to write a bit of a functional spec
>just to describe it in more detail.
>I haven't done anything with it in regards to xen, of course that
>would also be a different patch since it hits different code. If
>someone knows that code well maybe they can help. This is a simple
>patch, but it's made possible by a previous patch that reworks how the
>bridges are named, so enabling it for xen might not be as simple as
>this makes it look.
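A sketch of the admin's pre-creation step described above, assuming iproute2 and the eth1/vlan400 names used later in this thread (a host-config fragment, not part of the patch):

```shell
# The admin creates the outer-tag "vlan#" device that CloudStack will
# treat as a physical NIC; vlan400 carries outer tag 400 on eth1.
ip link add link eth1 name vlan400 type vlan id 400
ip link set vlan400 up

# Optionally raise MTUs so double-tagged guest traffic still fits
# (see the MTU discussion further down the thread).
ip link set dev eth1 mtu 9000
ip link set dev vlan400 mtu 9000
```

CloudStack then creates the double-tagged guest bridges (vlan400.10, vlan400.11, ...) on top of this device itself.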
>On Sat, Oct 20, 2012 at 10:57 PM, Chiradeep Vittal
><> wrote:
>> It looks like your patch does not require the admin to configure
>> anything wrt physical networks. The admin knows the list of "outer"
>> VLANs and CloudStack is blissfully unaware of the QinQ stuff.
>> This requires the hypervisors to be independently configured with the
>> outer VLAN bridges?
>> It also looks like this is a KVM-only solution.
>> Have you tried this with XS?
>> On 10/18/12 6:21 PM, "Marcus Sorensen" <> wrote:
>>>Ah, well it's pretty simple, so I'll just paste it here. Again,
>>>perhaps more should be implemented regarding the MTU (like
>>>functionality to configure MTU on the virtual router), but if you know
>>>what to do it can all work via switch configs.
>>>diff --git
>>>index 1bc70fa..70de3db 100755
>>>@@ -800,7 +800,7 @@ public class LibvirtComputingResource extends ServerResourceBase implements
>>>         String pif = Script.runSimpleBashScript("brctl show | grep " + bridge + " | awk '{print $4}'");
>>>         String vlan = Script.runSimpleBashScript("ls /proc/net/vlan/" +
>>>-        if (vlan != null && !vlan.isEmpty()) {
>>>+        if (vlan != null && !vlan.isEmpty() && (!pif.startsWith("vlan") || pif.matches("vlan\\d+\\.\\d+"))) {
>>>                 pif = Script.runSimpleBashScript("grep ^Device\\: /proc/net/vlan/" + pif + " | awk {'print $2'}");
>>>         }
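Read as a predicate, the patched condition dereferences the bridge's parent device unless it is a single-tagged vlan#-style device. A standalone Python sketch of just that decision (hypothetical helper name; the regex mirrors the Java one):

```python
import re

def should_dereference(pif: str) -> bool:
    """Return True when the parent interface should be resolved to its own
    parent via /proc/net/vlan, i.e. for eth#.# style devices and for
    double-tagged vlan#.# devices. Single-tagged vlan# devices are left
    alone so CloudStack treats them like physical NICs."""
    # Java's String.matches() anchors at both ends; re.fullmatch does too.
    return not pif.startswith("vlan") or bool(re.fullmatch(r"vlan\d+\.\d+", pif))
```

So a bridge on eth1.102 still resolves through to eth1, a bridge on vlan400.10 resolves to vlan400, and vlan400 itself stops the lookup.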
>>>On Thu, Oct 18, 2012 at 8:05 AM, Chip Childers
>>><> wrote:
>>>> On Thu, Oct 18, 2012 at 12:42 AM, Marcus Sorensen <> wrote:
>>>>> Sorry, I've been up to my ears. I've attached the simple patch that
>>>>> makes this all happen, if anyone wants to take a look. This is the
>>>>> code that looks for physical devices. It's passed a bridge and then
>>>>> determines the parent of that bridge, then whether that parent is a
>>>>> tagged device and goes one more step and finds its parent. This just
>>>>> circumvents the last lookup if the parent of the bridge is a "vlan"
>>>>> device (single tagged, e.g. vlan100) but not a double-tagged one
>>>>> (e.g. vlan100.10), and the rest of cloudstack treats vlan100 as though it
>>>>> were a physical device, creates tagged bridges on it if it has guest
>>>>> traffic type, etc. I've been using it in our test bed for about a
>>>>> month, and have only run into the MTU issue.
>>>> Hey Marcus,
>>>> Attachments get stripped.  Can you post it somewhere?
>>>>> If people still think it's a good idea, I'll create a functional spec
>>>>> and additional info on how it works.
>>>>>  I've also got a small patch to, but I'm debating
>>>>> whether or not it's necessary. It detects whether the "physical
>>>>> interface" is actually a vlan tagged interface, and if so it subtracts
>>>>> the necessary bytes from the MTU when it sets up the double-tagged
>>>>> bridges. It's technically not necessary, as the important part is
>>>>> whether the guest MTUs fit inside the MTU that the switch allows once
>>>>> the extra tag is added. But it just makes it a bit more obvious as to
>>>>> what's needed. However, it also breaks the admin's ability to bump the
>>>>> switch MTUs up just a bit, say to 1532, to account for the excess
>>>>> rather than having to go up to 9000 or full jumbo. If anyone is a
>>>>> network guru or has any feedback it would be appreciated, but I'm
>>>>> inclined to leave the MTUs alone and write it into the functional spec
>>>>> that a switch with a 1500 MTU supports double tags up to 1468, and a
>>>>> switch with a 9000 MTU supports VM guest networks up to 8968 MTU.
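The closing arithmetic above is a simple subtraction; note that the 32 bytes of headroom here is taken from the figures Marcus quotes (1500 -> 1468, 9000 -> 8968) rather than derived from the 4-bytes-per-802.1Q-tag rule, so treat it as his stated budget, not a protocol constant:

```python
def max_guest_mtu(switch_mtu: int, headroom: int = 32) -> int:
    """Largest guest MTU that fits under a given switch MTU once the
    double-tag headroom (as budgeted in this thread) is reserved."""
    return switch_mtu - headroom
```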
>>>>> On Mon, Oct 15, 2012 at 1:43 PM, Marcus Sorensen <> wrote:
>>>>>> Ok, I'll pull out the changes and let people see them. Cloudstack
>>>>>> seems to let me put the same vlan ranges on multiple physicals, but
>>>>>> I haven't done much actual testing with large numbers of vlans. I
>>>>>> imagine there would be other bottlenecks if they all needed to be
>>>>>> on the same host at once. Luckily we only create bridges for the
>>>>>> actual VMs on the box so it should scale reasonably.
>>>>>> The only caveat I've run into so far is that you either need to be
>>>>>> running jumbo frames on your switches, or turn down the MTU on the
>>>>>> guests a bit to accommodate the space taken by the extra tag. If you
>>>>>> wanted to run jumbo frames on the guests as well, you'd run into the
>>>>>> same situation and have to use slightly less than the 9000 (although
>>>>>> the virtual router would require a patch too for the new size).
>>>>>> On Mon, Oct 15, 2012 at 9:56 AM, Ahmad Emneina
>>>>>><> wrote:
>>>>>>> On 10/15/12 8:35 AM, "Kelceydamage@bbits" <> wrote:
>>>>>>>>That's a far more elegant way than I tried, which was creating
>>>>>>>>interfaces within guests.
>>>>>>>>Sent from my iPhone
>>>>>>>>On Oct 15, 2012, at 12:54 AM, Chiradeep Vittal
>>>>>>>><> wrote:
>>>>>>>>> This sounds like it can be modeled as multiple physical networks:
>>>>>>>>> each "outer" vlan (400, 401, etc) is a separate physical network
>>>>>>>>> in the same zone. That could work, although it is probable that
>>>>>>>>> the zone configuration API bits prevent more than 4k VLANs per
>>>>>>>>> zone (that can be changed to per physical network).
>>>>>>>>> As long as communication between guests on different physical
>>>>>>>>> networks happens via the public network, it should be Ok.
>>>>>>>>> I'd like to see the patch.
>>>>>>>>> Thanks
>>>>>>>>> On 10/12/12 1:09 AM, "Marcus Sorensen" <> wrote:
>>>>>>>>>> Guys, in looking for a free and scalable way to provide isolated
>>>>>>>>>> networks for customers I've been running a QinQ setup that has
>>>>>>>>>> been working well. I've sort of laid the groundwork for it already
>>>>>>>>>> with the bridge naming conventions about a month ago for KVM (to
>>>>>>>>>> not collide if the same vlan is used twice on different physical
>>>>>>>>>> networks).
>>>>>>>>>> Basically the way it works is like this. Linux has two ways of
>>>>>>>>>> naming tagged networks, the eth#.# and the less used vlan# style
>>>>>>>>>> devices. I have a tiny patch that causes cloudstack to treat vlan#
>>>>>>>>>> devs as if they were physical NICs. In this way, you can do
>>>>>>>>>> something like physical networks eth0, eth1, and vlan400:
>>>>>>>>>> management traffic on eth0's bridge, storage on eth1.102's bridge,
>>>>>>>>>> maybe eth1.103 for public/guest, then create say a vlan400 that is
>>>>>>>>>> tag 400 on eth1. You add a traffic type of guest to it and give it
>>>>>>>>>> a vlan range, say 10-4000. Then you end up handing out vlan400.10,
>>>>>>>>>> vlan400.11, etc for guest networks. Works great for network
>>>>>>>>>> isolation without burning through a bunch of your "real" vlans.
>>>>>>>>>> In the unlikely event that you run out, you just create a physical
>>>>>>>>>> vlan401 and start over with the vlan numbers.
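The naming scheme in the paragraph above can be sketched as a pair of helpers (hypothetical names; the vlan400 / vlan400.10 convention follows what Marcus describes):

```python
def outer_dev(outer_tag: int) -> str:
    """The admin-created 'vlan#' device carrying the outer tag,
    treated by the patch as if it were a physical NIC."""
    return f"vlan{outer_tag}"

def guest_dev(outer_tag: int, guest_vlan: int) -> str:
    """The double-tagged device CloudStack creates for one guest
    network, e.g. vlan400.10 for guest vlan 10 on outer tag 400."""
    return f"{outer_dev(outer_tag)}.{guest_vlan}"
```

Running out of inner tags just means starting a new outer device: guest_dev(401, 10) gives "vlan401.10".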
>>>>>>>>>> In theory, all-you-can-eat isolated networks without having to
>>>>>>>>>> provision hundreds of vlans on your networking equipment. This may
>>>>>>>>>> require additional config on any upstream switches to pass the
>>>>>>>>>> double tags around, but in general from what I've seen the inner
>>>>>>>>>> tags just pass through on layer 2; it should only get tricky if
>>>>>>>>>> you try to tunnel or route the tags.
>>>>>>>>>> This is especially nice with system VM routers (they take care of
>>>>>>>>>> everything), but admittedly external routers will have spotty
>>>>>>>>>> support for being able to route double tagged stuff. I'm also a
>>>>>>>>>> bit afraid that if I were to get it merged in that it would just
>>>>>>>>>> be an undocumented hack thing that few know about and nobody uses.
>>>>>>>>>> So I'm looking for feedback on whether this sounds useful enough
>>>>>>>>>> to commit, how it should be documented, and whether it makes sense
>>>>>>>>>> to hint at this somehow.
>>>>>>> +1
>>>>>>> This actually sounds amazing, Marcus. I'd love to see and use this
>>>>>>> implementation.
>>>>>>> --
>>>>>>> Æ
