cloudstack-dev mailing list archives

From Rohit Yadav <rohit.ya...@shapeblue.com>
Subject Re: [4.11] Management to VR connection issues
Date Mon, 26 Feb 2018 11:41:23 GMT
Hi Rene,


- On the general issue of slow iptables rule application: I think we need to fix that. Does
it help to increase the aggregation timeouts?


- If waiting for ssh and apache2 as part of post-init solves the issue, this would require
a new systemvmtemplate, as changes to the systemd scripts cannot be applied or take effect
during first boot.


- I think the additional nics have always shown up for vmware; there is a global setting
to configure this (extra nics for vmware, probably because older versions did not support
dynamic nic addition on vmware VRs).


- For VR timeouts, check the logs and check whether you can SSH into the VR from the
management server host using the private IP and port 3922. See the troubleshooting wiki: https://cwiki.apache.org/confluence/display/CLOUDSTACK/SSVM%2C+templates%2C+Secondary+storage+troubleshooting


- Can you share/check which processes are consuming the RAM? 256MB RAM is usually enough for
non-redundant VRs (share the output of top, or check using htop). Make sure to use a recent Linux
version (any Debian variant such as Debian 8, 9 or Ubuntu 16.04+ may also work). The issue
is that vCenter/ESXi 6.5, for some reason, gives less RAM compared to 6.0 and 5.5 and has poor
support for legacy OSes. I found this issue while testing redundant VRs, which usually take
more RAM than normal VRs.
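
The reachability part of the VR timeout check above can be scripted from the management server host. This is only a minimal Python sketch: it assumes the VR should accept TCP connections on port 3922 of its private IP, the function name is made up for illustration, and it tests only that the port accepts a connection, not that an SSH login succeeds.

```python
import socket

VR_SSH_PORT = 3922  # port mentioned in the point about VR timeouts


def vr_port_reachable(host, port=VR_SSH_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper, not part of CloudStack.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable addresses.
        return False
```

Calling vr_port_reachable() with the VR's private IP from the management server host gives a quick yes/no before digging into the logs. If it returns False, the problem is likely on the network or sshd side rather than in the management server.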


- Rohit




________________________________
From: Rene Moser <mail@renemoser.net>
Sent: Monday, February 26, 2018 11:22:27 AM
To: users@cloudstack.apache.org; dev@cloudstack.apache.org
Subject: Re: [4.11] Management to VR connection issues

Hi again

We found the main problem.

== cloud-postinit hang

With many iptables rules, cloud-postinit hangs for
10 min until it is killed by systemd. As a result, the ssh daemon is
not started for 10 min because it is configured to start after
cloud-postinit.

It seems the issue was already fixed by
https://github.com/apache/cloudstack/commit/ce67726c6d3db6e7db537e76da6217c5d5f4b10e

== VR still needs manual reboot

However, we still notice adapter changes after a reboot: see the before/after
screenshots of "ip addr" in
https://photos.app.goo.gl/9XsjOJjLqQ9SRjYV2. We still need to manually
reboot the VR to get the network actually working.

== VR has too many adapters?

Next, we noticed that there are many network adapters (NICs) for this
non-VPC router (see the screenshot of the vCenter in
https://photos.app.goo.gl/9XsjOJjLqQ9SRjYV2). Adapters 4 and 5 seem
unnecessary. Any comments on that?

== VR with 256 MB RAM does not work

The next issue we found is that the VR must have more than 256MB RAM.
Otherwise systemd complains that the daemon cannot be reloaded because
the RAM disk at /run has too little space.

Feb 23 16:24:36 r-413-VM postinit.sh[1089]: Failed to reload daemon:
Refusing to reload, not enough space available on /run/systemd.
Currently, 8.6M are free, but a safety buffer of 16.0M is enforced.
root@r-413-VM:~# df -h /run/
Filesystem      Size  Used Avail Use% Mounted on
tmpfs            16M  7.2M  8.7M  46% /run

Increasing to 512MB RAM helped:

root@r-413-VM:~# df -h /run/
Filesystem      Size  Used Avail Use% Mounted on
tmpfs            41M  7.8M   34M  19% /run

Unsure if this can be tuned at the systemd level; we didn't find a way yet.
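
To check other VRs for the same condition, the free space on /run can be compared against the 16 MiB buffer from the log message above. This is a rough Python sketch, to be run on the VR itself. The 16 MiB threshold is taken from the log line, not from systemd documentation, the function names are made up, and whether systemd compares with "<" or "<=" is an assumption.

```python
import os

# "a safety buffer of 16.0M is enforced", taken from the log line above
SAFETY_BUFFER_BYTES = 16 * 1024 * 1024


def free_bytes(path="/run"):
    """Free space (in bytes) available to unprivileged users on path's filesystem."""
    st = os.statvfs(path)
    return st.f_bavail * st.f_frsize


def reload_would_be_refused(free, buffer=SAFETY_BUFFER_BYTES):
    """True if free space is below the buffer, i.e. the condition the log reports.

    Hypothetical helper; assumes a strict less-than comparison.
    """
    return free < buffer
```

With the numbers above, 8.6M free (256MB RAM) is below the buffer and 34M free (512MB RAM) is above it, which matches what we observed. On a VR, reload_would_be_refused(free_bytes("/run")) gives the same answer for the live system.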

== VR API Command timeouts

When executing a command related to the VR, e.g. restart network or start/stop
router, the command doesn't reach the vCenter API and times out. We are
not sure yet why.

== VR minor fixes

Next, we fixed 2 minor things along the way.

* rsyslogd config syntax issue
* IMHO we should also start apache2 after cloud-postinit

Also see https://github.com/apache/cloudstack/pull/2468

Regards
René
