cloudstack-dev mailing list archives

From Darren Shepherd <darren.s.sheph...@gmail.com>
Subject Re: CS-Management HA Networking
Date Sat, 26 Oct 2013 17:08:07 GMT
Glad that helped.  Seems that we should change CloudStack to ignore mac addresses that are
all zeros (00:..:00).  If you want to put in a bug you can assign it to me and I'll look into changing that.
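
A rough sketch of what that change could look like, not the actual CloudStack code. The class and method names (MacFilter, isUsable, toLong, firstUsable) are made up for illustration, and the eth0 address is invented. It assumes the candidate addresses have already been collected as 6-byte arrays, skips any that are all zeros, and shows that an all-zero address packs to the long value 0, which matches the "addr in integer is 0" output quoted below.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of CloudStack.
public class MacFilter {

    /** True if the address is non-null, 6 bytes long, and not all zeros. */
    static boolean isUsable(byte[] mac) {
        if (mac == null || mac.length != 6) {
            return false;
        }
        for (byte b : mac) {
            if (b != 0) {
                return true;
            }
        }
        return false;
    }

    /** Packs the bytes into a long, most significant byte first. */
    static long toLong(byte[] mac) {
        long value = 0;
        for (byte b : mac) {
            value = (value << 8) | (b & 0xffL);
        }
        return value;
    }

    /** Returns the first usable address in the list, or null if there is none. */
    static byte[] firstUsable(List<byte[]> candidates) {
        for (byte[] mac : candidates) {
            if (isUsable(mac)) {
                return mac;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] bond0 = new byte[6];                          // 00:00:00:00:00:00, the inactive bond
        byte[] eth0 = {0x00, 0x1b, 0x21, 0x3a, 0x4c, 0x5d};  // invented example address
        byte[] chosen = firstUsable(Arrays.asList(bond0, eth0));
        System.out.println(toLong(bond0));   // 0, which a "node id must be > 0" check would reject
        System.out.println(chosen == eth0);  // true, the all-zero entry was skipped
    }
}
```

With a filter like this, the order in which "ifconfig -a" lists an unused bond0 would no longer matter, since the zero address is never picked as the node id.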

Darren

> On Oct 26, 2013, at 5:35 AM, Marty Sweet <msweet.dev@gmail.com> wrote:
> 
> Hi Darren, thanks for the heads up about that script.
> 
> Old Networking Setup:
> eth0 eth1 -> management0
> management0.11 -> vlan11
> management0.12 -> vlan12
> 
> Turns out in true Ubuntu Networking fashion bond0 was being created for no
> reason and was appearing in ifconfig -a (so the script was pulling out the
> first mac address it found), although it was not active and could not be
> downed.
> 
> bond0     Link encap:Ethernet  HWaddr 00:00:00:00:00:00
>          BROADCAST MASTER MULTICAST  MTU:1500  Metric:1
>          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>          collisions:0 txqueuelen:0
>          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
> 
> Under this configuration the script returned:
> addr in integer is 0
> addr in bytes is  0 0 0 0 0 0
> addr in char is 00:00:00:00:00:00
> 
> 
> Once I used bond0 as my bond name, as opposed to management0, it started
> working, as the bond was now in use.
> New Networking Setup:
> eth0 eth1 -> bond0
> bond0.11 -> vlan11
> bond0.12 -> vlan12
> 
> Many thanks,
> Marty
> 
> 
>> On Wed, Oct 23, 2013 at 7:44 AM, Marty Sweet <msweet.dev@gmail.com> wrote:
>> 
>> Hi Darren,
>> 
>> Thanks for getting back to me. I will set the networking config up again
>> and run the commands you sent me over the next couple of days.
>> 
>> Thanks,
>> Marty
>> 
>> 
>> On Tue, Oct 22, 2013 at 11:39 PM, Darren Shepherd <
>> darren.s.shepherd@gmail.com> wrote:
>> 
>>> Well that wasn't a very useful message.  If you can find the cloud-utils
>>> jar on your server, run
>>> 
>>> java -cp <PATH>/cloud-utils-4.1.1.jar com.cloud.utils.net.MacAddress
>>> 
>>> That will output what it's finding for the mac address.  Also run an
>>> "ifconfig -a" from the command line.  If you wouldn't mind sending the
>>> output of "ifconfig -a" that would be helpful to see what's going
>>> wrong.
>>> 
>>> Darren
>>> 
>>> On Tue, Oct 22, 2013 at 2:48 PM, Marty Sweet <msweet.dev@gmail.com>
>>> wrote:
>>>> Just noticed I didn't include the log:
>>>> 
>>>> http://pastebin.com/wUtCsSAb
>>>> 
>>>> Marty
>>>> 
>>>> 
>>>>> On Tue, Oct 22, 2013 at 10:38 PM, Marty Sweet <msweet.dev@gmail.com>
>>>> wrote:
>>>> 
>>>>> Hi Darren,
>>>>> 
>>>>> Maybe I'm getting confused with an issue I had with the Agents around that
>>>>> time!
>>>>> The error message I got was very cryptic. Having a fresh look at the
>>>>> source code:
>>>>> https://github.com/apache/cloudstack/blob/04cdd90a84f4be5ba02778fe0cd352a4b1c39a13/utils/src/org/apache/cloudstack/utils/identity/ManagementServerNode.java
>>>>> 
>>>>> This would suggest that it gets: private static final long s_nodeId =
>>>>> MacAddress.getMacAddress().toLong(); and fails if it's <= 0 in the check()
>>>>> function, which is run by the SystemIntegrityChecker.
>>>>> 
>>>>> Hopefully it is just a MAC address issue. What would the IntegrityChecker
>>>>> be looking for?
>>>>> 
>>>>> Thanks,
>>>>> Marty
>>>>> 
>>>>> 
>>>>> On Tue, Oct 22, 2013 at 10:02 PM, Darren Shepherd <
>>>>> darren.s.shepherd@gmail.com> wrote:
>>>>> 
>>>>>> Do you have a specific error from a log?  I was not aware that
>>>>>> CloudStack would look for interfaces w/ eth*, em*.  In the code it
>>>>>> just does "ifconfig -a" to list the devices.  By creating a bond, the
>>>>>> mac address CloudStack finds will probably change, so I could imagine
>>>>>> something failing.
>>>>>> 
>>>>>> Darren
>>>>>> 
>>>>>> On Tue, Oct 22, 2013 at 1:39 PM, Marty Sweet <msweet.dev@gmail.com>
>>>>>> wrote:
>>>>>>> Hi Guys.
>>>>>>> 
>>>>>>> I am planning on upgrading my 4.1.1 infrastructure to 4.2 over the
>>>>>>> weekend.
>>>>>>> 
>>>>>>> When testing my 4.1.1 setup I ran across a problem where a TOR switch
>>>>>>> failure would cause an outage to the management server. The agents use 2
>>>>>>> NICs for all management traffic using bonds.
>>>>>>> When I tried to configure the management server to use a bond0 in simple
>>>>>>> active-passive mode (like I use for my agent management network),
>>>>>>> cloudstack-management would not start due to 'Integrity Issues', which at
>>>>>>> the time I traced back to an IntegrityChecker which ensures the interfaces
>>>>>>> of eth*, em* or some others were taking the IP of the management server.
>>>>>>> 
>>>>>>> My question is: does this limitation still exist and if so, can it be
>>>>>>> overcome by adding bond* to the list of allowed interface names and
>>>>>>> compiling the management server from source?
>>>>>>> I would love to hear input on this; it seems bizarre to me that it is
>>>>>>> difficult to add simple but effective network redundancy to the management
>>>>>>> server.
>>>>>>> 
>>>>>>> For reference, this is the basic redundant network setup I have for my
>>>>>>> Agents:
>>>>>>> 4x KVM Hosts all with 4 NICs - 2 bonds (Private/Public Traffic)
>>>>>>> 
>>>>>>> Example Host:
>>>>>>> ------------------Interconnect---------------
>>>>>>>      TOR 1      ---------      TOR 2
>>>>>>> ---------------------          ---------------------
>>>>>>>          |      Management      |
>>>>>>>          |     Tagged VLANs    |
>>>>>>> ----------------------------------------------------
>>>>>>>       KVM Cloudstack Hypervisor
>>>>>>> ----------------------------------------------------
>>>>>>>          |      Public Traffic         |
>>>>>>>          |      Tagged VLANS     |
>>>>>>>          |      LACP Aggregation |
>>>>>>> ----------------------------------------------------
>>>>>>>                Core Router
>>>>>>> ----------------------------------------------------
>>>>>>> 
>>>>>>> There are also LACP links with STP rules between the TOR switches and the
>>>>>>> core device to allow for interconnect failure so the TORs do not become
>>>>>>> isolated, but I have excluded that for simplicity.
>>>>>>> 
>>>>>>> 
>>>>>>> I would have thought it would be easy to create a bond for my management
>>>>>>> node and connect the two NICs to both the TOR switches, but that didn't
>>>>>>> work in 4.1.1 for the reasons above.
>>>>>>> 
>>>>>>> Thanks!
>>>>>>> Marty
>> 
>> 
