OK, so from what I could tell, both routers were blocking ICMP on the WAN port.
Going through the management ports, I was able to enable a feature so that both routers can be pinged
on their WAN interface.
I also experimented with the ping command to see whether packets of certain sizes could be sent and received.
The results were as follows:
From               To           Packet Size  Result
Internet           Root Router  1473         Reply
Internet           Root Router  1507         Reply
LAN 1              Internet     1473         Reply
LAN 1              Internet     1507         Reply
LAN 2 (XenServer)  Internet     1473         Reply
LAN 2 (XenServer)  Internet     1507         Reply
LAN 2 (SSVM)       Internet     1400         Reply
LAN 2 (SSVM)       Internet     1473         Failure
LAN 2 (SSVM)       Internet     1507         Failure
LAN 1              2nd Router   1473         Reply
LAN 1              2nd Router   1507         Reply
Based on this information I think the problem is at the SSVM / hypervisor level. I am not sure
why the hypervisor hosting the VM is able to receive a ping of size 1507 but the VM it is
hosting cannot.
Why would the MTU be different for the VM?
My second question is: why is PMTUD not working? If every hop along the path is passing
pings, why can the wget command not resize frames?
I am stumped!
Thanks,
Taylor
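
A small sketch of the arithmetic behind the ping sizes in the table above. It assumes the sizes are ICMP payload sizes (as with `ping -s` on Linux), an IPv4 header without options, and a standard Ethernet MTU of 1500 bytes; none of these is confirmed in the thread, and the function names are made up for illustration.

```python
ICMP_HEADER = 8    # bytes in an ICMP echo header
IPV4_HEADER = 20   # bytes in an IPv4 header without options (assumption)


def ipv4_packet_size(icmp_payload: int) -> int:
    """Total IPv4 packet size for an ICMP echo with the given payload."""
    return icmp_payload + ICMP_HEADER + IPV4_HEADER


def needs_fragmentation(icmp_payload: int, mtu: int = 1500) -> bool:
    """True if the resulting packet is larger than the link MTU."""
    return ipv4_packet_size(icmp_payload) > mtu


# Payload sizes taken from the table; the 1500-byte MTU is an assumption.
for payload in (1400, 1473, 1507):
    total = ipv4_packet_size(payload)
    verdict = "needs fragmentation" if needs_fragmentation(payload) else "fits"
    print(f"payload {payload} -> IPv4 packet {total} bytes: {verdict}")
```

Under those assumptions 1400 fits in a single packet while 1473 and 1507 do not, which lines up with where the SSVM results change from Reply to Failure.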
________________________________
From: Taylor <tschneider@live.com>
Sent: Wednesday, July 19, 2017 12:23 PM
To: users@cloudstack.apache.org
Cc: Taylor
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
I reread the wiki article and some other google articles.
I think I understand the ICMP issue now: If the router is blocking all ICMP then it will not
receive the (ICMP) Fragmentation Needed (Type 3, Code 4) message containing the MTU of the
other node on the network with the smaller MTU.
Going back to my last question: I think the problem is that ICMP is blocked on the WAN port
of the router. I think that would prevent the ICMP messages from traversing the NAT. I assume
it is enabled on the LAN port because I can ping Google. Is this correct thinking?
I am looking up how to spoof ICMP messages to debug this further.
Thanks,
Taylor
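
To make the Fragmentation Needed message mentioned above more concrete, here is a rough sketch of reading the next-hop MTU out of one. The header layout follows RFC 1191; the function name and the hand-built sample bytes are illustrative only, and the checksum is not verified.

```python
import struct

ICMP_DEST_UNREACHABLE = 3       # ICMP type 3
CODE_FRAGMENTATION_NEEDED = 4   # ICMP code 4


def next_hop_mtu(icmp_message: bytes):
    """Return the next-hop MTU from a Fragmentation Needed message, else None."""
    if len(icmp_message) < 8:
        return None
    # First 8 bytes: type, code, checksum, unused, next-hop MTU (network byte order).
    icmp_type, code, _checksum, _unused, mtu = struct.unpack(
        "!BBHHH", icmp_message[:8]
    )
    if icmp_type != ICMP_DEST_UNREACHABLE or code != CODE_FRAGMENTATION_NEEDED:
        return None
    return mtu


# Hand-built sample message advertising an MTU of 1400; checksum left at 0.
sample = struct.pack("!BBHHH", 3, 4, 0, 0, 1400)
print(next_hop_mtu(sample))
```

A real message also carries the IP header and first 8 bytes of the packet that was dropped, which is how the sender matches it to a connection. If a router on the path drops this message, the sender never learns the smaller MTU.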
________________________________
From: Taylor <tschneider@live.com>
Sent: Wednesday, July 19, 2017 11:36 AM
To: users@cloudstack.apache.org
Cc: Taylor
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
Si,
Thanks for explaining that. Yes, it makes sense.
My two routers are a Netgear and one running DD-WRT.
Does ICMP need to be enabled on both, or just on the NAT'd router (DD-WRT)?
I am looking into the router settings / config pages now to get more familiar with what options
are available.
Thanks,
Taylor
________________________________
From: Simon Weller <sweller@ena.com.INVALID>
Sent: Wednesday, July 19, 2017 11:03 AM
To: users@cloudstack.apache.org
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
MTU path discovery uses ICMP type 3 code 4 messages. If your routers are blocking all ICMP
inbound from the internet, they will never receive the messages and won't know that an upstream
router needs the packet to be retransmitted with a smaller MTU size.
Does that make sense?
- Si
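
A toy model of the behaviour described above, not a real network tool: the sender starts at its own MTU with "don't fragment" set and shrinks the packet each time a hop reports a smaller MTU. The link MTU values and the function name are hypothetical.

```python
def discover_path_mtu(link_mtus, icmp_delivered=True, start_mtu=1500):
    """Simulate path MTU discovery over a list of per-link MTUs.

    Returns the packet size that finally gets through, or None if an
    oversized packet is dropped and the ICMP report never reaches the sender.
    """
    size = start_mtu
    while True:
        # First link along the path that is too small for the current packet.
        bottleneck = next((mtu for mtu in link_mtus if mtu < size), None)
        if bottleneck is None:
            return size            # packet fits on every link
        if not icmp_delivered:
            return None            # dropped silently: the transfer just stalls
        size = bottleneck          # sender learns the smaller MTU and retries


path = [1500, 1492, 1400]          # hypothetical link MTUs
print(discover_path_mtu(path, icmp_delivered=True))
print(discover_path_mtu(path, icmp_delivered=False))
```

With ICMP delivered, the sender settles on the smallest MTU on the path. With ICMP blocked it gets nothing back, which from the application's point of view looks like a connection that hangs and then times out.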
________________________________
From: Taylor <tschneider@live.com>
Sent: Wednesday, July 19, 2017 9:37 AM
To: users@cloudstack.apache.org
Cc: Taylor
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
I don't think ICMP is being blocked. I can ping google.com from inside the NAT'd LAN.
Am I misunderstanding what you said?
________________________________
From: Simon Weller <sweller@ena.com.INVALID>
Sent: Wednesday, July 19, 2017 10:04 AM
To: users@cloudstack.apache.org
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
So if your routers are blocking all ICMP, they will break MTU path discovery. See this: https://en.wikipedia.org/wiki/Path_MTU_Discovery
________________________________
From: Taylor <tschneider@live.com>
Sent: Wednesday, July 19, 2017 8:57 AM
To: users@cloudstack.apache.org
Cc: Taylor
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
Hi Simon,
I am not sure about the networking problem you mentioned. I will google that, and if you have
any quick ways to check, let me know.
As for the double NAT, the answer is yes. My network is configured as follows:
Internet -- 1000 Mbps router -- 100 Mbps router -- Hypervisor / CloudStack / NFS
As far as I am aware the routers are acting as firewalls for incoming traffic (they are simple
home routers) but should not impact outgoing traffic.
Thanks,
Taylor
________________________________
From: Simon Weller <sweller@ena.com.INVALID>
Sent: Wednesday, July 19, 2017 9:41 AM
To: users@cloudstack.apache.org
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
Taylor,
To me this sounds like you might have some sort of networking problem, such as MTU path discovery
being broken and some device in the path setting a do not fragment flag.
Can you give us a bit more info about how you are connected to the internet as Dag has suggested?
Is there a firewall in front of your switch? Are you double NATing the traffic?
- Si
________________________________
From: Taylor <tschneider@live.com>
Sent: Wednesday, July 19, 2017 8:09 AM
To: users@cloudstack.apache.org
Cc: Taylor
Subject: RE: Cloudstack 4.9 CentOS Template failure - Connection Reset
forgot to cc myself
-------- Original message --------
From: Taylor <tschneider@live.com>
Date: 7/19/17 08:08 (GMT-06:00)
To: users@cloudstack.apache.org
Subject: RE: Cloudstack 4.9 CentOS Template failure - Connection Reset
Hey Dag,
I have tried both. The health check is good and the vm behavior does not change after recreation.
I think the problem is the network latency or the vm's resource allocation.
The download will work, but a connection timeout occurs about every 20 MB, so it needs to be
done in pieces. The hypervisor which hosts the VM is able to download without a problem.
I am on a 100 Mbps switch. In the past I was using a 1000 Mbps one.
Any other thoughts on debug or work around?
I think adding retry logic should be a simple fix?
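
For what it is worth, a rough sketch of what such retry logic could look like outside of CloudStack: resume the download from the current file size with an HTTP Range request after each timeout, much as wget does in the logs further down. The function names and defaults are made up for illustration, and this is not how the SSVM itself downloads templates.

```python
import http.client
import os
import urllib.error
import urllib.request


def resume_request(url, offset):
    """Build a GET request that asks for the bytes from `offset` onward."""
    req = urllib.request.Request(url)
    if offset > 0:
        req.add_header("Range", f"bytes={offset}-")
    return req


def download_with_resume(url, dest, max_tries=20, read_timeout=10.0,
                         opener=urllib.request.urlopen):
    """Download `url` to `dest`, resuming after timeouts. Returns bytes on disk."""
    for attempt in range(1, max_tries + 1):
        offset = os.path.getsize(dest) if os.path.exists(dest) else 0
        try:
            with opener(resume_request(url, offset), timeout=read_timeout) as resp:
                # 206 means the server honoured the Range header. On a plain
                # 200 it is sending the whole file again, so start over.
                mode = "ab" if offset and resp.status == 206 else "wb"
                with open(dest, mode) as out:
                    while True:
                        chunk = resp.read(64 * 1024)
                        if not chunk:
                            break
                        out.write(chunk)
            return os.path.getsize(dest)
        except urllib.error.HTTPError as err:
            if err.code == 416:    # range not satisfiable: nothing left to fetch
                return offset
            print(f"try {attempt}: HTTP {err.code}; retrying")
        except (OSError, http.client.HTTPException) as err:
            # Covers socket timeouts, connection resets and short reads.
            print(f"try {attempt}: {err}; retrying")
    raise RuntimeError(f"giving up on {url} after {max_tries} tries")
```

Calling `download_with_resume(url, "centos56-x86_64.vhd.bz2")` with the template URL would be the rough equivalent of the wget run. The caller should still compare the returned size against the expected length, and this only works around the symptom; it does not fix the underlying network problem.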
-------- Original message --------
From: Dag Sonstebo <Dag.Sonstebo@shapeblue.com>
Date: 7/19/17 03:11 (GMT-06:00)
To: users@cloudstack.apache.org
Subject: Re: Cloudstack 4.9 CentOS Template failure - Connection Reset
Hi Taylor,
This is most likely an issue with your environment rather than a bug. Take a look at your
public network and how that is connected to the internet. You have to let CloudStack pull
down the template, it’s difficult to manually populate this.
A couple of other things to try:
- recreate the SSVM – simply delete it and CloudStack will generate a new one.
- from internally in the SSVM you can also run the SSVM check script, which will do some basic
health checks for you:
root@s-2-VM:~# /usr/local/cloud/systemvm/ssvm-check.sh
================================================
First DNS server is 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 48 data bytes
56 bytes from 8.8.8.8: icmp_seq=0 ttl=53 time=24.146 ms
56 bytes from 8.8.8.8: icmp_seq=1 ttl=53 time=22.320 ms
--- 8.8.8.8 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 22.320/23.233/24.146/0.913 ms
Good: Can ping DNS server
================================================
Good: DNS resolves download.cloud.com
================================================
nfs is currently mounted
Mount point is /mnt/SecStorage/a833f5f1-1c6d-3e54-9a55-1fc9b7875c54
Good: Can write to mount point
================================================
Management server is 10.10.45.2. Checking connectivity.
Good: Can connect to management server port 8250
================================================
Good: Java process is running
================================================
Tests Complete. Look for ERROR or WARNING above.
Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue
On 19/07/2017, 05:29, "Taylor" <tschneider@live.com> wrote:
Hello,
I am experiencing an issue while trying to download the CentOS template.
It seems the connection is timing out and then failing.
To debug I logged into the SSVM and tried running a wget from the nfs directory mounted
on that vm. This also failed due to connection reset.
Wget will eventually succeed if I use retry logic as follows:
wget http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2 --read-timeout=10
Using this setting the download will complete in pieces after multiple timeouts (logs
below).
Is there a work around to add retry logic? Can I manually download and add the template
to the database and restart the service? How can I file a bug report?
Thanks,
Taylor
=============================================================================
LOGS:
=============================================================================
root@s-103-VM:/mnt/SecStorage/5d8e791e-01cc-3d7c-84d8-f469944056e0/template/tmpl/1/5# wget http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2 --read-timeout=10
--2017-07-19 03:25:10-- http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Resolving download.cloud.com (download.cloud.com)... 54.231.81.40
Connecting to download.cloud.com (download.cloud.com)|54.231.81.40|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 374730926 (357M) [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
19% [============================> ] 74,734,204 --.-K/s in 97s
2017-07-19 03:26:55 (756 KB/s) - Read error at byte 74734204/374730926 (Connection timed out). Retrying.
--2017-07-19 03:26:56-- (try: 2) http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Connecting to download.cloud.com (download.cloud.com)|54.231.81.40|:80... failed: Connection timed out.
Resolving download.cloud.com (download.cloud.com)... 52.216.81.16
Connecting to download.cloud.com (download.cloud.com)|52.216.81.16|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 374730926 (357M), 299996722 (286M) remaining [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
24% [+++++++++++++++++++++++++++++======> ] 93,094,824 --.-K/s in 24s
2017-07-19 03:28:23 (740 KB/s) - Read error at byte 93094824/374730926 (Connection timed out). Retrying.
--2017-07-19 03:28:25-- (try: 3) http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Connecting to download.cloud.com (download.cloud.com)|52.216.81.16|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 374730926 (357M), 281636102 (269M) remaining [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
29% [++++++++++++++++++++++++++++++++++++======> ] 112,111,876 --.-K/s in 26s
2017-07-19 03:29:23 (702 KB/s) - Read error at byte 112111876/374730926 (Connection timed out). Retrying.
--2017-07-19 03:29:26-- (try: 4) http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Connecting to download.cloud.com (download.cloud.com)|52.216.81.16|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 374730926 (357M), 262619050 (250M) remaining [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
34% [+++++++++++++++++++++++++++++++++++++++++++=======> ] 130,790,615 --.-K/s in 26s
2017-07-19 03:30:23 (712 KB/s) - Read error at byte 130790615/374730926 (Connection timed out). Retrying.
--2017-07-19 03:30:27-- (try: 5) http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Connecting to download.cloud.com (download.cloud.com)|52.216.81.16|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 374730926 (357M), 243940311 (233M) remaining [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
49% [+++++++++++++++++++++++++++++++++++++++++++++++++++====================> ] 185,802,362 --.-K/s in 49s
2017-07-19 03:31:23 (1.08 MB/s) - Read error at byte 185802362/374730926 (Connection timed out). Retrying.
--2017-07-19 03:31:28-- (try: 6) http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Connecting to download.cloud.com (download.cloud.com)|52.216.81.16|:80... failed: Connection timed out.
Resolving download.cloud.com (download.cloud.com)... 52.216.225.48
Connecting to download.cloud.com (download.cloud.com)|52.216.225.48|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 374730926 (357M), 188928564 (180M) remaining [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
60% [++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++================> ] 226,947,877 --.-K/s in 32s
2017-07-19 03:33:11 (1.21 MB/s) - Read error at byte 226947877/374730926 (Connection timed out). Retrying.
--2017-07-19 03:33:17-- (try: 7) http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Connecting to download.cloud.com (download.cloud.com)|52.216.225.48|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 374730926 (357M), 147783049 (141M) remaining [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
65% [+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++======> ] 247,259,880 --.-K/s in 28s
2017-07-19 03:33:46 (697 KB/s) - Read error at byte 247259880/374730926 (Connection timed out). Retrying.
--2017-07-19 03:33:53-- (try: 8) http://download.cloud.com/templates/builtin/centos56-x86_64.vhd.bz2
Connecting to download.cloud.com (download.cloud.com)|52.216.225.48|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 374730926 (357M), 127471046 (122M) remaining [binary/octet-stream]
Saving to: `centos56-x86_64.vhd.bz2'
100%[++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++==================================================>] 374,730,926 1010K/s in 2m 10s