cloudstack-users mailing list archives

From Rohit Yadav <rohit.ya...@shapeblue.com>
Subject Re: SSVM NIO SSL Handshake error
Date Sun, 28 May 2017 07:29:14 GMT
Hi Jason,


In your test environment that uses the same DB, can you try the workaround from [1] as an
experiment:


0) Make the relevant files and locations readable (chmod +r) and owned by the cloud user (chown cloud:cloud).


1) Stop Management Server

2) Delete the SSH keys from the MySQL database:

   delete from configuration where name = "ssh.publickey";
   delete from configuration where name = "ssh.privatekey";

3) Delete the SSH key files:

   rm /var/lib/cloudstack/management/.ssh/id_rsa.pub
   rm /var/lib/cloudstack/management/.ssh/id_rsa

4) Start the Management Server; the SSH keys are generated and the MySQL entries inserted.
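
Before deleting anything, it may be worth checking whether the public key in the database actually differs from the one on disk. A minimal Python sketch of that comparison, assuming both values are in the usual OpenSSH one-line format (key type, base64 blob, optional comment). The helper names pubkey_fingerprint and keys_match are made up for illustration and are not part of CloudStack:

```python
import base64
import hashlib


def pubkey_fingerprint(openssh_line: str) -> str:
    """Return the SHA-256 fingerprint of an OpenSSH one-line public key."""
    fields = openssh_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<type> <base64-blob> [comment]'")
    # Assumption: the second field is the base64-encoded key blob.
    blob = base64.b64decode(fields[1], validate=True)
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest base64-encoded without '=' padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def keys_match(file_key: str, db_key: str) -> bool:
    """Compare two public keys by key material, ignoring comments and whitespace."""
    return pubkey_fingerprint(file_key) == pubkey_fingerprint(db_key)
```

Here file_key would be the contents of the id_rsa.pub file and db_key the value stored under the name "ssh.publickey" in the configuration table. If keys_match returns False, the two have drifted apart and regenerating them as described in the steps is a reasonable next experiment.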



[1] http://markmail.org/message/zfjyd7s22itg7t7q


Regards.

________________________________
From: Jason Kinsella <jason@cloudpeople.com.au>
Sent: 27 May 2017 05:33:43
To: users@cloudstack.apache.org
Subject: Re: SSVM NIO SSL Handshake error

Files are linked here.

https://dl.dropboxusercontent.com/u/10588206/acs492/managmenet-server-logs.tar.gz
https://dl.dropboxusercontent.com/u/10588206/acs492/systemvm.tar.gz

Today we did a couple of additional tests that proved interesting. We’ve got a prod and
a dev server. Both were upgraded last month. The prod has the error, but the dev is working.
Everything was the same including CentOS 6.5.

We restored the dev DB into the fresh CentOS7 box and it displayed the same problem. This
would suggest an OS issue, in which case the converse should work. However, when we restored
the prod DB into the dev server, it continued to exhibit the problem.

This suggests that we may have missed something in the migration between servers. Here are
the steps:

    Stop the cloudstack-management service on the broken box
    Dump DB
    Copy to the new box and restore
    Copy db.properties & key files and update IP entry in db.properties
    Update DB entry host to new IP
    Delete DB ssl.keystore and keystore file
    Destroy systemVMs in VMware
    Start cloudstack-management on the new box

    The /var/cloudstack/management/.ssh/ files are referenced when we ssh to ssvm from MS
    so they are correct. What about ssh.public and ssh.private in db.cloud.configuration table?

    Regards,
    Jason

    On 25/5/17, 7:51 pm, "Rohit Yadav" <rohit.yadav@shapeblue.com> wrote:

        Hi Jason,


        Thanks for sharing the details. Yes, with the new setup please share with us the mgmt
        server logs and ssvm logs with TRACE enabled in the log4j configuration.


        Regards.

        ________________________________
        From: Jason Kinsella <jason@cloudpeople.com.au>
        Sent: 25 May 2017 12:49:50
        To: users@cloudstack.apache.org
        Subject: Re: SSVM NIO SSL Handshake error

        Hi Rohit,
        API login – fixed.

        Latest systemvmtemplate (shapeblue new) in place – no improvement

        No loadbalancer or known service on MS port 8250

        I am doing my testing now on a fresh install of CentOS7 using shapeblue noredist with
        DB restored.

        Hypervisor = vmware vsphere 6.5 with ESX 6.5

        Systemvm.iso is dated today

        All systemvms are exhibiting same behaviour.

        Would any other logs help?

        Regards,
        Jason

        On 25/5/17, 4:55 pm, "Rohit Yadav" <rohit.yadav@shapeblue.com> wrote:

            Hi Jason,


            The API login issue, which I believe you have already fixed, can be resolved by following
            this: http://docs.cloudstack.apache.org/projects/cloudstack-administration/en/4.9/accounts.html#using-dynamic-roles


            If not already in-use, can you try using the latest systemvmtemplate (for 4.6-4.9)
            from http://packages.shapeblue.com/systemvmtemplate/4.6/new.


            Do you have a load-balancer on port 8250 on the management server(s), or any script/service
            that may be trying to perform a tcp-connect on mgmt server's port 8250?


            When you upgrade can you make sure that both cloudstack-common and cloudstack-management
            packages are upgraded to 4.9.2.0? Also, what hypervisor(s) are you using?


            The following error may hint that the jars on systemvms may not be updated, as
            one of the exception classes is missing:


                2017-05-23 11:58:22,468 INFO  [utils.exception.CSExceptionErrorCode] (main:null) Could not find exception: com.cloud.utils.exception.NioConnectionException in error code list for exceptions
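
One rough way to check this is to look inside a jar from the systemvm for the class file. A minimal Python sketch, assuming the jar has been copied somewhere Python is available. The helper name jar_has_class is made up for illustration, and which jar should contain the class is not stated in this thread, so the path in the usage note is a placeholder:

```python
import zipfile


def jar_has_class(jar_path: str, class_name: str) -> bool:
    """Return True if the jar contains the compiled .class file for class_name."""
    # A jar is a zip archive; com.example.Foo is stored as com/example/Foo.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

For example, jar_has_class("/path/to/some.jar", "com.cloud.utils.exception.NioConnectionException") returning False for every jar on the systemvm would be consistent with stale jars, although it would not prove it.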


            Can you check that systemvm.iso is synced across hosts: (1) make sure the cloudstack-common
            package is upgraded/updated to the same version as cloudstack-management (4.9.2.0), (2) if
            you're using vmware, delete systemvm.iso from the secondary storage, (3) for xenserver force
            reconnect on the host (from ui/api) or manually copy the scripts to the xenserver host(s),
            (4) for kvm upgrade the cloudstack-common package.


            Then destroy all other systemvms and see if you can reproduce the issue.


            Regards.

            ________________________________
            From: Jason Kinsella <jason@cloudpeople.com.au>
            Sent: 25 May 2017 09:32:25
            To: users@cloudstack.apache.org
            Subject: Re: SSVM NIO SSL Handshake error

            Also, just wanted to mention that the symptoms we have with systemvms not connecting
            are described on the mailing list:

            CS 4.9 NIO Selector wait time PR-1601 - https://www.mail-archive.com/dev@cloudstack.apache.org/msg69154.html

            The only difference is that that thread refers to KVM hosts not connecting.

            I’ve tried most suggestions in that thread.

            On 25/5/17, 1:51 pm, "Jason Kinsella" <jason@cloudpeople.com.au> wrote:

                Java versions are as follows:

                MS: 1.7.0_141
                SSVM: 1.7.0_85

                Deleted keystore files (again) and restarted MS, then recreated the SSVM.

                Errors from SSVM:/var/log/cloud.log

                2017-05-25 03:01:28,757 INFO  [utils.nio.NioClient] (main:null) Connecting to 192.168.12.5:8250
                2017-05-25 03:01:29,293 WARN  [utils.nio.Link] (main:null) This SSL engine was forced to close inbound due to end of stream.
                2017-05-25 03:01:29,293 ERROR [utils.nio.Link] (main:null) Failed to send server's CLOSE message due to socket channel's failure.
                2017-05-25 03:01:29,294 ERROR [utils.nio.NioClient] (main:null) SSL Handshake failed while connecting to host: 192.168.12.5 port: 8250
                2017-05-25 03:01:29,294 ERROR [utils.nio.NioConnection] (main:null) Unable to initialize the threads.
                java.io.IOException: SSL Handshake failed while connecting to host: 192.168.12.5 port: 8250
                     at com.cloud.utils.nio.NioClient.init(NioClient.java:67)
                     at com.cloud.utils.nio.NioConnection.start(NioConnection.java:88)
                     at com.cloud.agent.Agent.start(Agent.java:228)
                     at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:399)
                     at com.cloud.agent.AgentShell.launchAgentFromClassInfo(AgentShell.java:367)
                     at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:351)
                     at com.cloud.agent.AgentShell.start(AgentShell.java:456)
                     at com.cloud.agent.AgentShell.main(AgentShell.java:491)

                Same SSL engine forced to close inbound due to end of stream
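
Since telnet to port 8250 passes but the agent reports a handshake failure, a small probe that separates the TCP connect from the TLS handshake may help narrow things down. A minimal Python sketch, not how the CloudStack agent itself connects. The helper name probe_tls is made up for illustration, and certificate verification is switched off on purpose because only the handshake outcome matters here:

```python
import socket
import ssl


def probe_tls(host: str, port: int, timeout: float = 5.0) -> str:
    """Classify a connection attempt as 'tcp_failed', 'handshake_failed', or 'ok'."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "tcp_failed"  # roughly what a failed telnet test would show

    # Verification is disabled deliberately: the goal is only to see whether
    # a handshake completes, not to validate the server's identity.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        # wrap_socket performs the handshake on an already-connected socket.
        with context.wrap_socket(sock) as tls:
            tls.version()
        return "ok"
    except OSError:  # ssl.SSLError is a subclass of OSError
        return "handshake_failed"
    finally:
        sock.close()
```

Something like print(probe_tls("192.168.12.5", 8250)) could be run from any machine that can reach the management server. A recent Python/OpenSSL build may refuse the older protocol versions a Java 7 server offers, so a "handshake_failed" result is only a hint and not proof of the same failure the SSVM sees.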







                On 25/5/17, 1:51 am, "Rajani Karuturi" <rajani@apache.org> wrote:

                    Can you check java version? Set the default java to 1.7 and delete keystore
                    files and restart MS

                    ~Rajani

                    Sent from phone.

                    On 24 May 2017 9:15 p.m., "Jason Kinsella" <jason@cloudpeople.com.au> wrote:

                    > I have now moved management server to a fresh CentOS7 server. But
                    > unfortunately I’m getting the exact same SSL handshake error. Back to
                    > square one.
                    >
                    > On 24/5/17, 11:40 pm, "Jason Kinsella" <jason@cloudpeople.com.au> wrote:
                    >
                    >     Hi All,
                    >     Based on the feedback it seems like the issue is related to CentOS
                    >     version, so I’ve built a new CentOS7 Management server using ShapeBlue
                    >     noredist. I’ve restored the 4.9.2.0 DB into this server and
                    >     management-server.logs look clean on boot. The only problem is that I can’t
                    >     log into the webUI.
                    >
                    >     The logs show a successful login (user = kinsja), but the API
                    >     command either is not allowed or doesn’t exist for the user. This means the
                    >     UI doesn’t load.
                    >
                    >     Anyone seen this with a restored DB?
                    >
                    >     2017-05-24 09:26:08,239 DEBUG [c.c.u.AccountManagerImpl] (catalina-exec-17:ctx-ee2c5e26) (logid:a8ca5ee5) User: kinsja in domain 1 has successfully logged in
                    >     2017-05-24 09:26:08,246 INFO  [c.c.a.ApiServer] (catalina-exec-17:ctx-ee2c5e26) (logid:a8ca5ee5) Current user logged in under  timezone
                    >     2017-05-24 09:26:08,246 INFO  [c.c.a.ApiServer] (catalina-exec-17:ctx-ee2c5e26) (logid:a8ca5ee5) Timezone offset from UTC is: 0.0
                    >     2017-05-24 09:26:08,251 DEBUG [c.c.a.ApiServlet] (catalina-exec-17:ctx-ee2c5e26) (logid:a8ca5ee5) ===END===  192.168.10.38 -- POST
                    >     2017-05-24 09:26:08,320 DEBUG [c.c.a.ApiServlet] (catalina-exec-13:ctx-a1d38347) (logid:3404c663) ===START===  192.168.10.38 -- GET command=listCapabilities&response=json&_=1495632368256
                    >     2017-05-24 09:26:08,325 DEBUG [c.c.a.ApiServer] (catalina-exec-13:ctx-a1d38347 ctx-960796a5) (logid:3404c663) The user with id:31 is not allowed to request the API command or the API command does not exist: listCapabilities
                    >
                    >     Thanks
                    >     Jason
                    >
                    >     From: Jason Kinsella <jason@cloudpeople.com.au>
                    >     Date: Tuesday, 23 May 2017 at 10:11 pm
                    >     To: "users@cloudstack.apache.org" <users@cloudstack.apache.org>
                    >     Subject: SSVM NIO SSL Handshake error
                    >
                    >     Hi,
                    >     We recently upgraded from 4.5.0 to 4.9.2.0 and encountered a problem
                    >     with the SSVM and Console Proxy. They cannot connect to the management
                    >     server. The SSVM cloud.log repeats this error every couple of seconds.
                    >
                    >     2017-05-23 11:58:22,461 INFO  [utils.nio.NioClient] (main:null) Connecting to 192.168.12.1:8250
                    >     2017-05-23 11:58:22,465 WARN  [utils.nio.Link] (main:null) This SSL engine was forced to close inbound due to end of stream.
                    >     2017-05-23 11:58:22,465 ERROR [utils.nio.Link] (main:null) Failed to send server's CLOSE message due to socket channel's failure.
                    >     2017-05-23 11:58:22,466 ERROR [utils.nio.NioClient] (main:null) SSL Handshake failed while connecting to host: 192.168.12.1 port: 8250
                    >     2017-05-23 11:58:22,466 ERROR [utils.nio.NioConnection] (main:null) Unable to initialize the threads.
                    >     java.io.IOException: SSL Handshake failed while connecting to host: 192.168.12.1 port: 8250
                    >          at com.cloud.utils.nio.NioClient.init(NioClient.java:67)
                    >          at com.cloud.utils.nio.NioConnection.start(NioConnection.java:88)
                    >          at com.cloud.agent.Agent.start(Agent.java:237)
                    >          at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:399)
                    >          at com.cloud.agent.AgentShell.launchAgentFromClassInfo(AgentShell.java:367)
                    >          at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:351)
                    >          at com.cloud.agent.AgentShell.start(AgentShell.java:456)
                    >          at com.cloud.agent.AgentShell.main(AgentShell.java:491)
                    >     2017-05-23 11:58:22,468 INFO  [utils.exception.CSExceptionErrorCode] (main:null) Could not find exception: com.cloud.utils.exception.NioConnectionException in error code list for exceptions
                    >     2017-05-23 11:58:22,468 WARN  [cloud.agent.Agent] (main:null) NIO Connection Exception  com.cloud.utils.exception.NioConnectionException: SSL Handshake failed while connecting to host: 192.168.12.1 port: 8250
                    >
                    >     The setup is very simple. Single management server and ports are open.
                    >
                    >     Things checked / tried:
                    >
                    >     ·         Destroyed SSVM multiple times – still same problem.
                    >
                    >     ·         SSH to SSVM from MS using ssh -i /var/cloudstack/management/.ssh/id_rsa -p 3922 root@IPADDRESS – PASS
                    >
                    >     ·         SSVM telnet on 8250 to MS – PASS
                    >
                    >     I’ve also tested a restore of the DB into our working development
                    >     4.9.2.0 server. It also exhibits the handshake errors, so most likely DB
                    >     related.
                    >
                    >     I’ve used up all my skills. Please help
                    >
                    >     Regards,
                    >     Jason
                    >
                    >
                    >
                    >





            rohit.yadav@shapeblue.com
            www.shapeblue.com
            53 Chandos Place, Covent Garden, London  WC2N 4HS UK
            @shapeblue














  
 

