cloudstack-users mailing list archives

From cs user <acldstk...@gmail.com>
Subject Re: Virtual Routers not responding to dns requests
Date Mon, 30 Mar 2015 14:44:32 GMT
Hi All,

Looking at the code, the following functions appear to have changed
between 4.3 and 4.4 and seem to relate to VIFs. Could these changes be
responsible for the incorrect iptables config which is generated?

getLowestAvailableVIFDeviceNum
setupLinkLocalNetwork

Within: cloudstack/4.4/plugins/hypervisors/xen/src/com/cloud/hypervisor/xen/resource/CitrixResourceBase.java

Thanks!
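(For anyone hitting the same thing: the manual workaround described further
down this thread can be scripted roughly as below. The chain name and VIF
numbers are the example values from this thread, not anything CloudStack
provides, and the function only prints the iptables commands rather than
executing them, so review the output before running it on the XenServer host.)

```shell
# Sketch of the manual fix described below: the router chain still holds
# PHYSDEV rules for a stale VIF (vif34.*) while the live VIF is vif35.*,
# so print the commands that would delete the stale rules and insert
# replacements. Nothing is executed here; pipe the output to sh only
# after checking it.
regen_vif_rules() {
  chain="$1"; old_vif="$2"; new_vif="$3"
  for dev in 0 1; do
    echo "iptables -D ${chain} -m physdev --physdev-in ${old_vif}.${dev} --physdev-is-bridged -j RETURN"
    echo "iptables -I ${chain} -m physdev --physdev-in ${new_vif}.${dev} --physdev-is-bridged -j RETURN"
  done
}

regen_vif_rules r-10900-VM vif34 vif35
```

Using -I keeps the new RETURN rules ahead of the chain's final ACCEPT; the
matching --physdev-in/--physdev-out jumps in BRIDGE-FIREWALL would need the
same treatment.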



On Sat, Mar 28, 2015 at 9:29 AM, cs user <acldstkusr@gmail.com> wrote:

> Hi Jayapal,
>
> Many thanks for getting back to me. Will there be a way for us to fix this
> ourselves then, or will we have to wait for an update? We are able to work
> around the problem in the following way, but this is no real solution.
>
> On the XenServer host where the router spins up, we can see the following VIFs:
>
>
>> vif35.0   Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
>>          UP BROADCAST RUNNING NOARP PROMISC  MTU:1500  Metric:1
>>          RX packets:524 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:717 errors:0 dropped:0 overruns:0 carrier:0
>>          collisions:0 txqueuelen:32
>>          RX bytes:31960 (31.2 KiB)  TX bytes:55718 (54.4 KiB)
>>
>> vif35.1   Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
>>          UP BROADCAST RUNNING NOARP PROMISC  MTU:1500  Metric:1
>>          RX packets:322 errors:0 dropped:0 overruns:0 frame:0
>>          TX packets:378 errors:0 dropped:1 overruns:0 carrier:0
>>          collisions:0 txqueuelen:32
>>          RX bytes:56138 (54.8 KiB)  TX bytes:63900 (62.4 KiB)
>
>
>
> However, the iptables rules which have been created reference vif34.0 (as
> below). If we delete these and recreate them referencing vif35, everything
> then works fine. We are able to query the router from our instances.
>
>>
>> Chain BRIDGE-FIREWALL (1 references)
>> target     prot opt source               destination
>> BRIDGE-DEFAULT-FIREWALL  all  --  anywhere             anywhere
>> r-10900-VM  all  --  anywhere             anywhere            PHYSDEV match --physdev-in vif34.0 --physdev-is-bridged
>> r-10900-VM  all  --  anywhere             anywhere            PHYSDEV match --physdev-out vif34.0 --physdev-is-bridged
>>
>> Chain r-10900-VM (4 references)
>> target     prot opt source               destination
>> RETURN     all  --  anywhere             anywhere            PHYSDEV match --physdev-in vif34.0 --physdev-is-bridged
>> RETURN     all  --  anywhere             anywhere            PHYSDEV match --physdev-in vif34.1 --physdev-is-bridged
>> ACCEPT     all  --  anywhere             anywhere
>
>
>
> Is anyone using 4.4.2 with Xen in basic networking and not having
> problems? If so, then perhaps we can fix this by amending the database,
> rather than having to wait for a bugfix to be released.
>
>
> This is how our networks table looks:
>
>> mysql> select id,name,traffic_type,network_domain,guest_type,specify_ip_ranges,
>>     ->   broadcast_uri,broadcast_domain_type,guru_name,acl_type from networks;
>> +-----+------+--------------+----------------------+------------+-------------------+-----------------+-----------------------+---------------------------+----------+
>> | id  | name | traffic_type | network_domain       | guest_type | specify_ip_ranges | broadcast_uri   | broadcast_domain_type | guru_name                 | acl_type |
>> +-----+------+--------------+----------------------+------------+-------------------+-----------------+-----------------------+---------------------------+----------+
>> | 200 | NULL | Public       | NULL                 | NULL       |                 1 | NULL            | Vlan                  | PublicNetworkGuru         | NULL     |
>> | 201 | NULL | Management   | NULL                 | NULL       |                 0 | NULL            | Native                | PodBasedNetworkGuru       | NULL     |
>> | 202 | NULL | Control      | NULL                 | NULL       |                 0 | NULL            | LinkLocal             | ControlNetworkGuru        | NULL     |
>> | 203 | NULL | Storage      | NULL                 | NULL       |                 1 | NULL            | Native                | StorageNetworkGuru        | NULL     |
>> | 204 | NP01 | Guest        | np01.domain.internal | Shared     |                 1 | vlan://untagged | Vlan                  | DirectPodBasedNetworkGuru | Domain   |
>> | 205 | NULL | Public       | NULL                 | NULL       |                 1 | NULL            | Vlan                  | PublicNetworkGuru         | NULL     |
>> | 206 | NULL | Management   | NULL                 | NULL       |                 0 | NULL            | Native                | PodBasedNetworkGuru       | NULL     |
>> | 207 | NULL | Control      | NULL                 | NULL       |                 0 | NULL            | LinkLocal             | ControlNetworkGuru        | NULL     |
>> | 208 | NULL | Storage      | NULL                 | NULL       |                 1 | NULL            | Native                | StorageNetworkGuru        | NULL     |
>> | 209 | P02  | Guest        | p02.domain.internal  | Shared     |                 1 | vlan://untagged | Vlan                  | DirectPodBasedNetworkGuru | Domain   |
>> | 210 | NULL | Public       | NULL                 | NULL       |                 1 | NULL            | Vlan                  | PublicNetworkGuru         | NULL     |
>> | 211 | NULL | Management   | NULL                 | NULL       |                 0 | NULL            | Native                | PodBasedNetworkGuru       | NULL     |
>> | 212 | NULL | Control      | NULL                 | NULL       |                 0 | NULL            | LinkLocal             | ControlNetworkGuru        | NULL     |
>> | 213 | NULL | Storage      | NULL                 | NULL       |                 1 | NULL            | Native                | StorageNetworkGuru        | NULL     |
>> | 214 | NP01 | Guest        | np01.domain.internal | Shared     |                 1 | vlan://untagged | Vlan                  | DirectPodBasedNetworkGuru | Domain   |
>> | 215 | NULL | Public       | NULL                 | NULL       |                 1 | NULL            | Vlan                  | PublicNetworkGuru         | NULL     |
>> | 216 | NULL | Management   | NULL                 | NULL       |                 0 | NULL            | Native                | PodBasedNetworkGuru       | NULL     |
>> | 217 | NULL | Control      | NULL                 | NULL       |                 0 | NULL            | LinkLocal             | ControlNetworkGuru        | NULL     |
>> | 218 | NULL | Storage      | NULL                 | NULL       |                 1 | NULL            | Native                | StorageNetworkGuru        | NULL     |
>> | 219 | P01  | Guest        | p01.domain.internal  | Shared     |                 1 | vlan://untagged | Vlan                  | DirectPodBasedNetworkGuru | Domain   |
>> +-----+------+--------------+----------------------+------------+-------------------+-----------------+-----------------------+---------------------------+----------+
>> 20 rows in set (0.00 sec)
>>
>>
>>
> In the nics table, when routers used to work, their nics appeared as
> follows:
>
>> mysql> select id,created,default_nic,display_nic,broadcast_uri,isolation_uri
>>     ->   from nics where vm_type="DomainRouter" and mode='Dhcp' and id='63605';
>> +-------+---------------------+-------------+-------------+-----------------+----------------+
>> | id    | created             | default_nic | display_nic | broadcast_uri   | isolation_uri  |
>> +-------+---------------------+-------------+-------------+-----------------+----------------+
>> | 63605 | 2015-03-11 12:15:09 |           1 |           1 | vlan://untagged | ec2://untagged |
>> +-------+---------------------+-------------+-------------+-----------------+----------------+
>> 1 row in set (0.00 sec)
>
> However now the nic looks like this when created:
>
>> mysql> select id,created,default_nic,display_nic,broadcast_uri,isolation_uri
>>     ->   from nics where vm_type="DomainRouter" and mode='Dhcp' and id='70001';
>> +-------+---------------------+-------------+-------------+-----------------+---------------+
>> | id    | created             | default_nic | display_nic | broadcast_uri   | isolation_uri |
>> +-------+---------------------+-------------+-------------+-----------------+---------------+
>> | 70001 | 2015-03-27 16:55:22 |           1 |           1 | vlan://untagged | NULL          |
>> +-------+---------------------+-------------+-------------+-----------------+---------------+
>> 1 row in set (0.00 sec)
>
>
>
> Are there any other tables or logs I can look at to help ?
>
>
> On Fri, Mar 27, 2015 at 5:28 PM, Jayapal Reddy Uradi <
> jayapalreddy.uradi@citrix.com> wrote:
>
>> While preparing the NIC for the VM, the isolation_uri is set depending on
>> the VLAN of the network address space (public/guest).
>> If there is no VLAN for the public network, then the VR public NIC
>> isolation_uri will be untagged.
>>
>> Thanks,
>> Jayapal
>>
>> On 27-Mar-2015, at 9:01 PM, cs user <acldstkusr@gmail.com> wrote:
>>
>> > Just to add, in the database, the nics table:
>> >
>> > isolation_uri used to be set to "ec2://untagged"
>> > for vm_type="DomainRouter" of mode Dhcp.
>> >
>> > This is no longer happening, however; they are just set to NULL. I don't
>> > see a way to set this in the networks table either. How does this
>> > isolation_uri column get set? Is it from the database, or hard-coded in
>> > the code?
>> >
>> >
>> >
>> >
>> >
>> > On Fri, Mar 27, 2015 at 10:02 AM, cs user <acldstkusr@gmail.com> wrote:
>> >
>> >> Hi Again... :-)
>> >>
>> >> So it looks like the VIF on the XenServer is not being set up correctly
>> >> (or has been removed). The following rule is defined on the Xen host which
>> >> the broken router is running on:
>> >>
>> >>> Chain r-10864-VM (4 references)
>> >>> target     prot opt source               destination
>> >>> RETURN     all  --  anywhere             anywhere            PHYSDEV match --physdev-in vif718.0 --physdev-is-bridged
>> >>> RETURN     all  --  anywhere             anywhere            PHYSDEV match --physdev-in vif718.1 --physdev-is-bridged
>> >>> ACCEPT     all  --  anywhere             anywhere
>> >> The VIF that is mentioned above is not present on the host (as below), but
>> >> on the working router, the VIF in the iptables rule does exist.
>> >>
>> >> On the host, we also see the following in the logs with the VIF mentioned:
>> >>
>> >> Mar 26 14:04:31 xen011 script-vif: vif718.0: writing backend/vif/718/0/hotplug-status=connected
>> >> Mar 26 14:04:31 xen011 scripts-vif: Setting vif718.1 MTU 1500
>> >> Mar 26 14:04:31 xen011 scripts-vif: Adding vif718.1 to xapi4 with address fe:ff:ff:ff:ff:ff
>> >> Mar 26 14:04:31 xen011 scripts-vif: Failed to ip link set vif718.1 address fe:ff:ff:ff:ff:ff
>> >> Mar 26 14:04:31 xen011 python: /opt/xensource/libexec/setup-vif-rules[3233] - ['/sbin/ip', 'link', 'set', 'vif718.1', 'down']
>> >> Mar 26 14:04:31 xen011 python: /opt/xensource/libexec/setup-vif-rules[3233] - ['/sbin/ebtables', '-L', 'FORWARD_vif718.1']
>> >> Mar 26 14:04:31 xen011 python: /opt/xensource/libexec/setup-vif-rules[3233] - ['/sbin/ip', 'link', 'set', 'vif718.1', 'up']
>> >> Mar 26 14:04:31 xen011 script-vif: vif718.1: writing backend/vif/718/1/hotplug-status=connected
>> >> Mar 26 14:05:12 xen011 script-vif: vif718.1: removing backend/vif/718/1/hotplug-status
>> >> Mar 26 14:05:12 xen011 script-vif: vif718.1: removing /xapi/718/hotplug/vif/1/hotplug
>> >> Mar 26 14:05:12 xen011 scripts-vif: vif718.1 has been removed
>> >> Mar 26 14:05:12 xen011 python: /opt/xensource/libexec/setup-vif-rules[4113] - ['/sbin/ip', 'link', 'set', 'vif718.1', 'down']
>> >> Mar 26 14:05:12 xen011 python: /opt/xensource/libexec/setup-vif-rules[4113] - ['/sbin/ebtables', '-L', 'FORWARD_vif718.1']
>> >> Mar 26 14:05:13 xen011 python: /opt/xensource/libexec/setup-vif-rules[4113] - ['/sbin/ip', 'link', 'set', 'vif718.1', 'up']
>> >> Mar 26 14:05:13 xen011 script-vif: vif718.0: removing backend/vif/718/0/hotplug-status
>> >> Mar 26 14:05:13 xen011 script-vif: vif718.0: removing /xapi/718/hotplug/vif/0/hotplug
>> >> Mar 26 14:05:13 xen011 scripts-vif: vif718.0 has been removed
>> >> Mar 26 14:05:13 xen011 python: /opt/xensource/libexec/setup-vif-rules[4156] - ['/sbin/ip', 'link', 'set', 'vif718.0', 'down']
>> >> Mar 26 14:05:13 xen011 python: /opt/xensource/libexec/setup-vif-rules[4156] - ['/sbin/ebtables', '-L', 'FORWARD_vif718.0']
>> >> Mar 26 14:05:13 xen011 python: /opt/xensource/libexec/setup-vif-rules[4156] - ['/sbin/ip', 'link', 'set', 'vif718.0', 'up']
>> >> Mar 26 14:05:24 xen011 fe: 4917 (/sbin/ip addr show dev vif718.0) exitted with code 255
>> >> Mar 26 14:05:25 xen011 fe: 5062 (/sbin/ip addr show dev vif718.1) exitted with code 255
>> >>
>> >>
>> >> List of VIFs; 718 is missing now, however:
>> >>
>> >>
>> >>> vif477.0  Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
>> >>>           UP BROADCAST RUNNING NOARP PROMISC  MTU:1500  Metric:1
>> >>>           RX packets:150128316 errors:0 dropped:0 overruns:0 frame:0
>> >>>           TX packets:163423985 errors:0 dropped:0 overruns:0 carrier:0
>> >>>           collisions:0 txqueuelen:32
>> >>>           RX bytes:598157233 (570.4 MiB)  TX bytes:501933888 (478.6 MiB)
>> >>>
>> >>> vif671.0  Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
>> >>>           UP BROADCAST RUNNING NOARP PROMISC  MTU:1500  Metric:1
>> >>>           RX packets:38112 errors:0 dropped:0 overruns:0 frame:0
>> >>>           TX packets:71566 errors:0 dropped:0 overruns:0 carrier:0
>> >>>           collisions:0 txqueuelen:32
>> >>>           RX bytes:2005682 (1.9 MiB)  TX bytes:92870677 (88.5 MiB)
>> >>>
>> >>> vif696.0  Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
>> >>>           UP BROADCAST RUNNING NOARP PROMISC  MTU:1500  Metric:1
>> >>>           RX packets:20049 errors:0 dropped:0 overruns:0 frame:0
>> >>>           TX packets:49817 errors:0 dropped:0 overruns:0 carrier:0
>> >>>           collisions:0 txqueuelen:32
>> >>>           RX bytes:1215219 (1.1 MiB)  TX bytes:62130987 (59.2 MiB)
>> >>>
>> >>> vif703.0  Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
>> >>>           UP BROADCAST RUNNING NOARP PROMISC  MTU:1500  Metric:1
>> >>>           RX packets:1459 errors:0 dropped:0 overruns:0 frame:0
>> >>>           TX packets:1803 errors:0 dropped:0 overruns:0 carrier:0
>> >>>           collisions:0 txqueuelen:32
>> >>>           RX bytes:48244 (47.1 KiB)  TX bytes:213662 (208.6 KiB)
>> >>>
>> >>> vif719.0  Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
>> >>>           UP BROADCAST RUNNING NOARP PROMISC  MTU:1500  Metric:1
>> >>>           RX packets:1571 errors:0 dropped:0 overruns:0 frame:0
>> >>>           TX packets:75983 errors:0 dropped:2 overruns:0 carrier:0
>> >>>           collisions:0 txqueuelen:32
>> >>>           RX bytes:74416 (72.6 KiB)  TX bytes:3710662 (3.5 MiB)
>> >>>
>> >>> vif719.1  Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
>> >>>           UP BROADCAST RUNNING NOARP PROMISC  MTU:1500  Metric:1
>> >>>           RX packets:7982 errors:0 dropped:0 overruns:0 frame:0
>> >>>           TX packets:8513 errors:0 dropped:1 overruns:0 carrier:0
>> >>>           collisions:0 txqueuelen:32
>> >>>           RX bytes:1349032 (1.2 MiB)  TX bytes:787782 (769.3 KiB)
>> >>>
>> >>> vif720.0  Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
>> >>>           UP BROADCAST RUNNING NOARP PROMISC  MTU:1500  Metric:1
>> >>>           RX packets:75 errors:0 dropped:0 overruns:0 frame:0
>> >>>           TX packets:77 errors:0 dropped:0 overruns:0 carrier:0
>> >>>           collisions:0 txqueuelen:32
>> >>>           RX bytes:3404 (3.3 KiB)  TX bytes:5502 (5.3 KiB)
>> >>
>> >>
>> >>
>> >> On Fri, Mar 27, 2015 at 8:57 AM, cs user <acldstkusr@gmail.com> wrote:
>> >>
>> >>> Hi Jayapal,
>> >>>
>> >>> Those two parameters are both set to 1.
>> >>>
>> >>> We have a router which has survived the upgrade and is still able to
>> >>> receive and answer DNS requests from instances. We have checked on the
>> >>> XenServer host and can see the following iptables config:
>> >>>
>> >>>> Chain FORWARD (policy ACCEPT)
>> >>>> target     prot opt source               destination
>> >>>> BRIDGE-FIREWALL  all  --  anywhere             anywhere            PHYSDEV match --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth0+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out bond3+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out bond0+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth1+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth3+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth6+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out bond1+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth2+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth5+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth7+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth4+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out bond2+ --physdev-is-bridged
>> >>>> DROP       all  --  anywhere             anywhere
>> >>>
>> >>>
>> >>>
>> >>> However, on a XenServer which is running a router which is not working,
>> >>> we can see the following:
>> >>>
>> >>>> Chain FORWARD (policy ACCEPT)
>> >>>> target     prot opt source               destination
>> >>>> BRIDGE-FIREWALL  all  --  anywhere             anywhere            PHYSDEV match --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth1+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth4+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth3+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth7+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out bond3+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out bond0+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth6+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out bond1+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth5+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out bond2+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth0+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth2+ --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth1 --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth4 --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth3 --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth7 --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out bond3 --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth6 --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out bond1 --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth5 --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out bond2 --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth0 --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out eth2 --physdev-is-bridged
>> >>>> ACCEPT     all  --  anywhere             anywhere            PHYSDEV match --physdev-out bond0 --physdev-is-bridged
>> >>>> DROP       all  --  anywhere             anywhere
>> >>>
>> >>>
>> >>> It seems the config is duplicated, but without the plus signs. I believe
>> >>> the entries ending in '+' are prefix wildcards (eth0+ matches eth0,
>> >>> eth0.100, and so on), while the duplicates match the exact name only?
>> >>>
>> >>> Cheers!
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Fri, Mar 27, 2015 at 8:46 AM, Jayapal Reddy Uradi <
>> >>> jayapalreddy.uradi@citrix.com> wrote:
>> >>>
>> >>>> Silly question, but is your XenServer configured correctly for the
>> >>>> bridge-mode related settings?
>> >>>>
>> >>>> #xe-switch-network-backend bridge
>> >>>> #echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
>> >>>> #echo 1 > /proc/sys/net/bridge/bridge-nf-call-arptables
>> >>>>
>> >>>> Thanks,
>> >>>> Jayapal
>> >>>> On 27-Mar-2015, at 1:50 PM, cs user <acldstkusr@gmail.com> wrote:
>> >>>>
>> >>>> Hi Somesh,
>> >>>>
>> >>>> arping looks good: the correct MAC address is displayed and we get a
>> >>>> unicast reply from the IP address.
>> >>>>
>> >>>> Erik, tried restarting dnsmasq; all looks fine. The VR is able to perform
>> >>>> outgoing DNS requests. There is nothing in the syslog/dnsmasq logs that I
>> >>>> can see. No egress rules are in place. The system VMs are able to perform
>> >>>> digs against Google's DNS, but not the virtual router. It seems it is
>> >>>> being blocked at the Xen level.
>> >>>>
>> >>>> We're seeing the below in the logs when restarting a network (either
>> >>>> ticking clear config or not). This appears to be similar to:
>> >>>>
>> >>>> https://issues.apache.org/jira/browse/CLOUDSTACK-7605
>> >>>>
>> >>>> We are using basic zones; some have multiple pods, others don't. We see
>> >>>> the same error in both. The routers come up and go green, though, and
>> >>>> dnsmasq is populated with the relevant info. DNS lookups work locally on
>> >>>> the router, just not remotely. DHCP is working for new machines which get
>> >>>> spun up.
>> >>>>
>> >>>> Is there a way to debug this? I've checked the logs on the router
>> >>>> (cloud.log) and can't see any errors in there.
>> >>>>
>> >>>> 2015-03-27 08:12:45,081 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-16:ctx-0b0aa78a job-189235 ctx-77114e2e) Implementing the network Ntwk[9f5655bf-3101-45d9-83eb-d9061eadc2bb|Guest|47] elements and resources as a part of network restart
>> >>>> 2015-03-27 08:12:45,096 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-16:ctx-0b0aa78a job-189235 ctx-77114e2e) Asking SecurityGroupProvider to implemenet Ntwk[9f5655bf-3101-45d9-83eb-d9061eadc2bb|Guest|47]
>> >>>> 2015-03-27 08:12:45,103 DEBUG [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-16:ctx-0b0aa78a job-189235 ctx-77114e2e) Asking VirtualRouter to implemenet Ntwk[9f5655bf-3101-45d9-83eb-d9061eadc2bb|Guest|47]
>> >>>> 2015-03-27 08:12:45,112 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (API-Job-Executor-16:ctx-0b0aa78a job-189235 ctx-77114e2e) Lock is acquired for network id 204 as a part of router startup in Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))] : Dest[Zone(8)-Pod(null)-Cluster(null)-Host(null)-Storage()]
>> >>>> 2015-03-27 08:12:45,119 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (API-Job-Executor-16:ctx-0b0aa78a job-189235 ctx-77114e2e) Skipping VR deployment: Found a running or starting VR in Pod null id=8
>> >>>> 2015-03-27 08:12:45,120 DEBUG [c.c.n.r.VirtualNetworkApplianceManagerImpl] (API-Job-Executor-16:ctx-0b0aa78a job-189235 ctx-77114e2e) Lock is released for network id 204 as a part of router startup in Dest[Zone(Id)-Pod(Id)-Cluster(Id)-Host(Id)-Storage(Volume(Id|Type-->Pool(Id))] : Dest[Zone(8)-Pod(null)-Cluster(null)-Host(null)-Storage()]
>> >>>> 2015-03-27 08:12:45,123 WARN  [o.a.c.e.o.NetworkOrchestrator] (API-Job-Executor-16:ctx-0b0aa78a job-189235 ctx-77114e2e) Failed to implement network Ntwk[9f5655bf-3101-45d9-83eb-d9061eadc2bb|Guest|47] elements and resources as a part of network restart due to
>> >>>> java.lang.NullPointerException
>> >>>>       at com.cloud.network.element.VirtualRouterElement.getRouters(VirtualRouterElement.java:952)
>> >>>>       at com.cloud.network.element.VirtualRouterElement.prepareAggregatedExecution(VirtualRouterElement.java:1099)
>> >>>>       at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.implementNetworkElementsAndResources(NetworkOrchestrator.java:1090)
>> >>>>       at org.apache.cloudstack.engine.orchestration.NetworkOrchestrator.restartNetwork(NetworkOrchestrator.java:2430)
>> >>>>       at com.cloud.network.NetworkServiceImpl.restartNetwork(NetworkServiceImpl.java:1892)
>> >>>>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >>>>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> >>>>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> >>>>       at java.lang.reflect.Method.invoke(Method.java:601)
>> >>>>       at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
>> >>>>       at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
>> >>>>       at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
>> >>>>       at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:106)
>> >>>>       at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>> >>>>       at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:51)
>> >>>>       at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:161)
>> >>>>       at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:91)
>> >>>>       at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
>> >>>>       at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
>> >>>>       at $Proxy156.restartNetwork(Unknown Source)
>> >>>>       at org.apache.cloudstack.api.command.user.network.RestartNetworkCmd.execute(RestartNetworkCmd.java:95)
>> >>>>       at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:141)
>> >>>>       at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:108)
>> >>>>       at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:503)
>> >>>>       at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
>> >>>>       at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
>> >>>>       at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
>> >>>>       at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
>> >>>>       at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
>> >>>>       at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:460)
>> >>>>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>> >>>>       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>> >>>>       at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>> >>>>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>> >>>>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>> >>>>       at java.lang.Thread.run(Thread.java:722)
>> >>>>
>> >>>> 2015-03-27 08:12:45,125 WARN  [c.c.n.NetworkServiceImpl] (API-Job-Executor-16:ctx-0b0aa78a job-189235 ctx-77114e2e) Network id=204 failed to restart.
>> >>>> 2015-03-27 08:12:45,140 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-16:ctx-0b0aa78a job-189235) Complete async job-189235, jobStatus: FAILED, resultCode: 530, result:org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Failed to restart network"}
>> >>>> 2015-03-27 08:12:45,152 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-16:ctx-0b0aa78a job-189235) Done executing org.apache.cloudstack.api.command.user.network.RestartNetworkCmd for job-189235
>> >>>> 2015-03-27 08:12:45,158 INFO  [o.a.c.f.j.i.AsyncJobMonitor] (API-Job-Executor-16:ctx-0b0aa78a job-189235) Remove job-189235 from job monitoring
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Thu, Mar 26, 2015 at 7:38 PM, Somesh Naidu <
>> Somesh.Naidu@citrix.com>
>> >>>> wrote:
>> >>>>
>> >>>> You might want to do an arping on the router's IP from one of the guests
>> >>>> and see how many records are returned.
>> >>>>
>> >>>> Somesh
>> >>>> CloudPlatform Escalations
>> >>>> Citrix Systems, Inc.
>> >>>>
>> >>>> -----Original Message-----
>> >>>> From: Erik Weber [mailto:terbolous@gmail.com]
>> >>>> Sent: Thursday, March 26, 2015 12:54 PM
>> >>>> To: users@cloudstack.apache.org
>> >>>> Subject: Re: Virtual Routers not responding to dns requests
>> >>>>
>> >>>> I briefly remember having similar problems at some point, but do not
>> >>>> recall details such as the version or the solution.
>> >>>>
>> >>>> 1) Does it work if you restart dnsmasq on the VR?
>> >>>> 2) is the VR able to do outgoing dns requests?
>> >>>> 3) anything in syslog/dnsmasq logs?
>> >>>> 4) any egress rules in place?
>> >>>>
>> >>>>
>> >>>> Erik
>> >>>>
>> >>>> On Thursday, 26 March 2015, cs user <acldstkusr@gmail.com> wrote:
>> >>>>
>> >>>> Hi All,
>> >>>>
>> >>>> We have upgraded from 4.3 to 4.4.2.
>> >>>>
>> >>>> After some issues with starting the system VMs, the virtual routers are
>> >>>> no longer responding to DNS requests from the VMs which we start (or from
>> >>>> existing ones).
>> >>>>
>> >>>> We have disabled the firewall on the virtual routers and ran tcpdump on
>> >>>> them, but we can't see any inbound traffic on port 53 (UDP or TCP). If we
>> >>>> log onto the virtual routers and dig locally against eth0 and the alias
>> >>>> on eth0, both of these return fine with the correct IP.
>> >>>>
>> >>>> This is using Xenserver 6.1 as the host.
>> >>>>
>> >>>> Has anyone come across this before? DHCP lookups appear to be working
>> >>>> fine. Is there a firewall rule in place on the router VMs (other than
>> >>>> iptables), similar to the security groups which are applied by Xen, which
>> >>>> is preventing these requests from hitting the routers?
>> >>>>
>> >>>> Many thanks for any help.
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>
>> >>
>>
>>
>
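P.S. on the '+' entries questioned mid-thread: iptables treats a trailing '+'
in an interface name as a prefix wildcard, so eth0+ matches eth0, eth0.100 and
so on, while a bare eth0 matches only that exact name. A tiny emulation of
that matching rule, purely for illustration (this is not code from iptables
or CloudStack):

```shell
# Emulate iptables' interface-name matching: a trailing '+' makes the
# pattern a prefix wildcard, otherwise the match must be exact.
iface_match() {
  pattern="$1"; iface="$2"
  case "$pattern" in
    *+) prefix="${pattern%+}"
        case "$iface" in
          "$prefix"*) return 0 ;;
          *) return 1 ;;
        esac ;;
    *)  [ "$pattern" = "$iface" ] ;;
  esac
}

iface_match 'eth0+' 'eth0.100' && echo "eth0+ matches eth0.100"
iface_match 'eth0'  'eth0.100' || echo "a bare eth0 does not"
```

So the duplicated rules without the plus are redundant for plain interfaces
already covered by the wildcard forms, and the suspicious part is the missing
vif-specific rules, not these.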
