cloudstack-users mailing list archives

From Sam Ceylani
Subject Re: VR not able to ping public gateway for almost 3 hours then it WORKS...
Date Wed, 29 Oct 2014 03:50:52 GMT
We thought about this possibility and checked every MAC and IP address on those 3 switches,
and everything is clean. We even set up a syslog server and started logging everything from
those 3 switches. That's why we configured a physical box with the same IP address (after
shutting down the VR) and checked the MAC address tables on those switches. We also had STP
set up for redundancy, so we considered that possibility too: we removed the redundancy,
stopped STP, and removed the redundant cabling to see if it would help. Nothing so far points
to the network setup, the switch ports, or the IP address. We even placed the XenServer host on
the very switch the internet connection is on, and we still have the same problem of waiting
1-3 hours before it starts pinging...

One thing we noticed during those checks was one of our management routers, which we were
using for DHCP (same management network, but the IP ranges for CS and for our router were
different). We realized this was a big mistake, since CloudStack runs its own DHCP on the
management network, so we disabled DHCP on that router. It is not really related to our public
network; it was being used to issue DHCP addresses to wireless clients (our iPads, laptops,
etc.). We thought different IP ranges would be OK, but they weren't, so we disabled it... just
a side note.

I would also like to mention one other thing: when we first set this up, we immediately checked
the other system VMs (this is our second or third setup due to this public IP problem). The
SSVM came up and started pinging right away. Then we checked the console proxy; it hesitated
for almost 10 minutes before it started pinging, which is still quick. We have this 1-3 hour
delay only with routers. In previous setups we have also seen the SSVM and console proxy
hesitate for some time, but they came up quicker than the routers.

iptables -L being very slow to respond while the router is not working is the most significant
symptom we have found so far. We just sit for a few hours issuing iptables -L, and if it
responds very fast then we know we can ping the gateway... and sure enough we can...
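
A minimal sketch of how this check could be timed instead of judged by eye. One possible
explanation, not confirmed anywhere in this thread, is that iptables -L without -n tries to
resolve addresses to names, so a slow listing may only reflect unreachable DNS; comparing it
with iptables -L -n would show that. The helper names and the 2-second threshold are
assumptions made for illustration.

```python
import subprocess
import time


def time_command(argv, timeout=120.0):
    """Run argv and return elapsed wall-clock seconds, or None on timeout.

    Raises FileNotFoundError if the command does not exist on this machine.
    """
    start = time.monotonic()
    try:
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    return time.monotonic() - start


def is_slow(elapsed, threshold=2.0):
    """Treat a timeout (None) or anything over `threshold` seconds as slow.

    The 2-second default is an arbitrary, illustrative cut-off.
    """
    return elapsed is None or elapsed > threshold


# Example use on the VR (needs root); not executed here:
#   with_lookup = time_command(["iptables", "-L"])
#   numeric = time_command(["iptables", "-L", "-n"])
#   print("iptables -L slow:", is_slow(with_lookup))
#   print("iptables -L -n slow:", is_slow(numeric))
```

If only the listing without -n is slow, name resolution would be the first thing to look at;
if both are slow, the cause is more likely elsewhere.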


Sam Ceylani, MBA
Computer Engineer
MisterCertified Inc.

301 W. Platt St. Suite 447, Tampa, FL 33606
P 813.264.6460 M 813.416.7867
F 800.553.9520

On Oct 28, 2014, at 11:29 PM, "Philippe Bechamp" wrote:


Have you considered an IP address conflict ?

arping could help you track this down if that could be the case.
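
A minimal sketch of how arping output could be checked for a conflict: if more than one MAC
address answers for the same IP, two devices are probably claiming it. The function names are
hypothetical, and the parser simply collects every MAC-shaped token, on the assumption that
the arping variant in use prints only the MACs of responders.

```python
import re

# Matches colon-separated MAC addresses such as 00:16:3e:aa:bb:01.
MAC_RE = re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")


def responding_macs(arping_output):
    """Return the set of distinct MAC addresses found in arping output."""
    return {mac.lower() for mac in MAC_RE.findall(arping_output)}


def looks_like_conflict(arping_output):
    """True when more than one distinct MAC replied for the probed IP."""
    return len(responding_macs(arping_output)) > 1


# Example use; the interface name and IP address are placeholders:
#   import subprocess
#   out = subprocess.run(["arping", "-c", "5", "-I", "eth2", "203.0.113.10"],
#                        capture_output=True, text=True).stdout
#   print(responding_macs(out), looks_like_conflict(out))
```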

No hard data but instinct and internettance scream IP conflict in my brain !

Good luck !

Phil Bechamp | Director of Online Operations
+1.514.812.9609 ext. 222

From: Sam Ceylani
Sent: Tuesday, October 28, 2014 11:19 PM
Subject: VR not able to ping public gateway for almost 3 hours then it WORKS...

VR not able to ping public gateway for almost 3 hours, then it works.

CloudStack 4.4.1 (new install) and XenServer 6.2; the public and management networks are not
tagged and use VLAN 1. For some reason, when a VR is created it is not able to communicate with
its public gateway for almost 2-3 hours, and then all of a sudden it starts pinging. Once it
starts pinging, restarting the VR etc. is not a problem, and it works as soon as it comes up.
But the problem happens again when the router is destroyed and created again: it cannot ping
the gateway for some time, and it takes 30-45 minutes, sometimes 2-3 hours, to start working
again.

We have 3 HP switches. The first one is connected to the internet gateway, and the XenServer
hosts are connected via (active-passive) bonds through untagged ports on the other 2 switches
(through a trunk port). The networks are: iSCSI (primary storage, VLANs 30, 31, 32, 33), NFS
(secondary storage, VLAN 34), guest (500-550), public (VLAN 1), management (VLAN 1).

We log on to the virtual router and issue iptables -L, and the response is very slow (once it
starts working the response is very fast). We tried to traceroute the gateway IP, and it very
quickly displays a blank * * * for all 30 hops. ifconfig -a displays all the right information
for the network interfaces. We tried removing and reinserting the egress rule (ALL), but that
didn't help; we still had to wait a few hours for the router to start pinging again.

We tried using this same IP on a physical machine connected to the same switch on an untagged
port, and it works as soon as we configure the IP. We can ping this VR from outside and it
responds OK, so we know that the network configuration is OK. We suspect the firewall rules are
not being downloaded in a timely manner, but we checked the /var/log/cloud.log file on the
router and there is really no change before and after (pinging), so we really don't know how to
troubleshoot this problem any further...

If requested, I can upload the cloud.log file from the VR. We compared this log file with one
from a working VR and found no difference between them.

The template file and CS 4.4.1 were downloaded around Oct 6.

I know it is hard to troubleshoot this kind of issue, but if you can point me to possible
causes, that would be perfect, so we have somewhere to start troubleshooting this problem.

When we used tcpdump on the router, we noticed that before it starts working we see more
traffic displayed (conversations covering almost every network activity on the switch), and
once it starts working there is an almost 60% reduction in TCP conversations across all
interfaces on the router...
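
A minimal sketch of how the two tcpdump captures could be compared more precisely. It assumes
the capture was taken with tcpdump -e -n (the message does not say which options were used),
so each line carries source and destination MAC addresses, and it reports what share of frames
was addressed to unicast MACs other than the router's own. A high share before the router
starts working could mean the switch is flooding traffic meant for other hosts, but that is
only one possible reading. All function names are hypothetical.

```python
import re
from collections import Counter

# Start of a `tcpdump -e` line: "<timestamp> <src mac> > <dst mac>, ..."
_MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"
FRAME_RE = re.compile(r"^\S+ (" + _MAC + r") > (" + _MAC + r"),", re.IGNORECASE)


def count_by_destination(tcpdump_lines):
    """Count captured frames per destination MAC; unparsable lines are skipped."""
    counts = Counter()
    for line in tcpdump_lines:
        match = FRAME_RE.match(line)
        if match:
            counts[match.group(2).lower()] += 1
    return counts


def is_group_address(mac):
    """Broadcast and multicast MACs have the low bit of the first octet set."""
    return int(mac.split(":")[0], 16) & 1 == 1


def foreign_unicast_share(counts, own_mac):
    """Fraction of frames sent to unicast MACs other than own_mac."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    own = own_mac.lower()
    foreign = sum(n for mac, n in counts.items()
                  if mac != own and not is_group_address(mac))
    return foreign / total
```

Running this on a capture saved before and another saved after the router starts pinging would
turn the "almost 60% reduction" into per-destination numbers that can be compared directly.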


