guacamole-user mailing list archives

From Daniel Wolf <dan...@strigo.io>
Subject Connection problems with RDP
Date Mon, 25 Feb 2019 09:33:38 GMT
Hello all,

*Bottom line:*

Our clients connect over RDP, via Guacamole, to Windows Server 2016
instances running on AWS.
We experience:
a) Freezes of the client UI (resolved by closing the client and
re-establishing the connection, or sometimes just by waiting a while).
b) Failures to connect from the client.
Roughly speaking, 1 out of 10 clients experiences one of these problems.

*Some background to the setup:*

guacd version 1.0.0 runs in a Docker container on an AWS EC2 instance.

I have my own web server written in Java, using the Guacamole SDK (also
version 1.0.0).
It also runs in a Docker container, on a different EC2 instance.
It is pretty straightforward: for every client, it creates a tunnel
wrapping the socket to guacd.

The web server and guacd find each other using HashiCorp's Consul.
A local Consul agent runs on both EC2 instances. When the web server
connects to guacd, it actually connects to a DNS name, which the Consul
service resolves to the real address of the guacd instance. That instance
is in the same VPC and subnet as the web server.
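
One way to check this resolution and connection path in isolation is a small probe run from the web server's container. The following is only a sketch under stated assumptions: the class name GuacdProbe and the DNS name "guacd.service.consul" are hypothetical placeholders (the real Consul name is not given here), and 4822 is guacd's default port, which may differ in this setup. It resolves the name, attempts a TCP connection to every address returned, and prints how long each attempt took.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

public class GuacdProbe {

    /**
     * Resolves the host name and attempts a TCP connection to each resolved
     * address. Returns true if at least one address accepted the connection
     * within the timeout.
     */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        InetAddress[] addresses;
        try {
            addresses = InetAddress.getAllByName(host);
        } catch (UnknownHostException e) {
            System.out.println("DNS lookup failed for " + host + ": " + e.getMessage());
            return false;
        }

        boolean reachable = false;
        for (InetAddress address : addresses) {
            long start = System.nanoTime();
            // try-with-resources closes the socket after each attempt
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(address, port), timeoutMillis);
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                System.out.println(address.getHostAddress() + ":" + port
                        + " connected in " + elapsedMs + " ms");
                reachable = true;
            } catch (IOException e) {
                System.out.println(address.getHostAddress() + ":" + port
                        + " failed: " + e.getMessage());
            }
        }
        return reachable;
    }

    public static void main(String[] args) {
        // Hypothetical defaults: replace with the actual Consul DNS name and port
        String host = args.length > 0 ? args[0] : "guacd.service.consul";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 4822;
        System.out.println(canConnect(host, port, 2000) ? "reachable" : "unreachable");
    }
}
```

Note that the probe opens a connection and closes it without sending any Guacamole instruction, so each run would be expected to show up in the guacd log as a connection that ended during the handshake.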

We wrote our own client, using guacamole-common-js (also version 1.0.0).
The WebSockets that the client app opens to the web server go through an
AWS ALB and then through eBay's Fabio load balancer (these also expose the
web server over HTTPS).

It's also worth mentioning that we have multiple services of guacd and web
server in multiple containers, with the ALB and Fabio load balancing them.

*More info I have:*

1. We ran guacd at the debug log level and saw many occurrences of:
INFO: Guacamole connection closed during handshake
DEBUG: Error reading "select": End of stream reached while reading instruction

2. We also see these errors in guacd when trying to establish the RDP
connection for the first time:
certificate_store_open: error opening [/root/.config/freerdp/known_hosts] for writing
unexpected pubKeyAuth buffer size :0
Could not verify public key echo!
Authentication failure, check credentials.
If credentials are valid, the NTLMSSP implementation may be to blame.
Error: protocol security negotiation or connection failure

3. We also see this line in the guacd logs when establishing the RDP
connection:
Unable to find a match for unix timezone: Etc/UTC

4. In the Guac configuration for the connection, we use:
security: any
ignore-cert: true

5. We also use enable-drive: true.
The drive's path is an AWS EFS drive we mount on the guacd instance.

6. On the client we see 514 error codes.
We can't correlate them with the symptoms we see, i.e. the disconnections
or freezes.

7. We do see that, upon connecting, both the guacd container and the web
server container show high CPU load (it can exceed 100% utilization when
opening ~20 connections).
The specs of the containers themselves:
guacd runs with 2000 MHz / 6 GB / 30 Mbits.
The web server runs with 2000 MHz / 2 GB / 10 Mbits.

8. We don't see any heavy load in the web traffic, either in the client or
on the servers.

9. In our app we have different clients connecting to the same machine
desktop using the same guac connection.
Each client has a tunnel of its own to the web server, which then shares
the guacd connection between the clients.
Not sure if this is problematic, but worth mentioning...

10. We changed our infrastructure to run only a single instance each of the
guacd and web server containers, to reduce possible points of failure.
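
For reference on item 1: "select" is the first instruction guacd expects from whatever connects to it, so the DEBUG line indicates that the connection ended before that instruction arrived. Below is a minimal sketch of how such an instruction is framed on the wire, assuming the documented Guacamole protocol format (each element is LENGTH.VALUE, elements are comma-separated, and the instruction ends with a semicolon). The class GuacInstruction is a hypothetical illustration and not the SDK's own implementation.

```java
public class GuacInstruction {

    /**
     * Encodes an opcode and its arguments as a single Guacamole protocol
     * instruction: LENGTH.VALUE elements, comma-separated, ending in ';'.
     */
    public static String encode(String opcode, String... args) {
        StringBuilder sb = new StringBuilder();
        appendElement(sb, opcode);
        for (String arg : args) {
            sb.append(',');
            appendElement(sb, arg);
        }
        return sb.append(';').toString();
    }

    private static void appendElement(StringBuilder sb, String value) {
        // The length prefix counts Unicode characters (code points), not bytes
        sb.append(value.codePointCount(0, value.length()))
          .append('.')
          .append(value);
    }

    public static void main(String[] args) {
        // First instruction of a handshake that requests the RDP protocol
        System.out.println(encode("select", "rdp")); // prints 6.select,3.rdp;
    }
}
```

Any TCP connection to guacd that closes before sending a complete instruction of this form would be expected to produce the two log lines quoted in item 1.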

*To sum up:*

I think this is pretty much all of the information I have.
As you can see, we have been unable to pinpoint where the problem is.

I'd appreciate it if you could offer some direction: what we can check, how
we can check it, whether you have run into something similar, or whether
anything in this setup looks suspicious.

Many thanks!
Daniel
