ignite-user mailing list archives

From Stanislav Lukyanov <stanlukya...@gmail.com>
Subject RE: Question about client disco-/reconnect behaviour
Date Mon, 19 Mar 2018 15:59:39 GMT
Oh, sure, I guess I need to be more precise.

From the “clustering” point of view (i.e. considering long-term connections that hold
the cluster together) the client has a single server that it is connected to.
The clustering part is handled by the Discovery SPI subsystem (and its default implementation

During the cluster lifetime there are peer-to-peer connections created between all nodes,
clients and servers, which will transfer most of the data (all cache operations, etc).
These connections are created and closed as needed. This is handled by the Communication SPI
subsystem (and its default implementation TcpCommunicationSpi).

What you see in the logs below is the client connecting to both servers to perform cache or
other operations (note the TcpCommunicationSpi class), but only one server is responsible
for making sure the client is connected and alive.
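
As a rough illustration of the distinction above, here is a toy model in plain Java. It does not use any Ignite API: `ToyClient`, `router`, `sendTo` and `serverKilled` are hypothetical names, and picking the first live server as the new discovery server is an assumption made for simplicity, not a description of how Ignite chooses. It only shows why killing a server produces disconnect/reconnect events in one case and nothing in the other.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these names are Ignite classes.
class ToyClient {
    private final Set<String> aliveServers = new LinkedHashSet<>();
    // On-demand peer-to-peer links (the "communication" side).
    private final Set<String> commConnections = new LinkedHashSet<>();
    private final List<String> events = new ArrayList<>();
    // The single server that tracks this client (the "discovery" side).
    private String router;

    ToyClient(List<String> servers) {
        aliveServers.addAll(servers);
        router = servers.get(0); // assumption: attach to the first server listed
    }

    // A cache operation opens a communication connection to the target server.
    void sendTo(String server) {
        if (!aliveServers.contains(server)) {
            throw new IllegalStateException("Server is not alive: " + server);
        }
        commConnections.add(server);
    }

    void serverKilled(String server) {
        aliveServers.remove(server);
        commConnections.remove(server);
        if (!server.equals(router)) {
            return; // the discovery link is untouched, so no events are raised
        }
        events.add("DISCONNECTED");
        // Assumption: simply pick the first remaining server, if any.
        router = aliveServers.stream().findFirst().orElse(null);
        if (router != null) {
            events.add("RECONNECTED");
        }
    }

    List<String> events() { return events; }
    String router() { return router; }
    Set<String> commConnections() { return commConnections; }
}

class DiscoveryModelDemo {
    public static void main(String[] args) {
        ToyClient a = new ToyClient(Arrays.asList("server1", "server2"));
        a.sendTo("server1");
        a.sendTo("server2");      // communication connections to both servers
        a.serverKilled("server2"); // not the discovery server
        System.out.println(a.events()); // prints []

        ToyClient b = new ToyClient(Arrays.asList("server1", "server2"));
        b.serverKilled("server1"); // the discovery server
        System.out.println(b.events()); // prints [DISCONNECTED, RECONNECTED]
    }
}
```

In the model, the client talks to both servers for data, yet only the loss of its discovery server is visible as a disconnect followed by a reconnect.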

About the documentation: I believe we don’t have this explained in detail yet, but some
work is going on to improve the docs on networking in Ignite a bit – stay tuned.


From: Bellenger, Dominique
Sent: 19 March 2018, 18:21
To: user@ignite.apache.org
Subject: AW: Question about client disco-/reconnect behaviour

Hey Stan,
thank you for your answer. Is there any documentation about that behaviour
(besides the code itself, of course 😊)?

My observations show that the client is indeed connecting to both servers. When both servers
are running and I start the client I get the following output from the server logs:

First try
15:15:00.0941|DEBUG|TcpCommunicationSpi|Accepted new client connection: /
15:15:00.0941|INFO    |TcpCommunicationSpi|Accepted incoming communication connection [locAddr=/,
15:15:00.1097|DEBUG|TcpCommunicationSpi|Sending local node ID to newly accepted session: GridSelectorNioSessionImpl
[ ... ]
15:15:00.1097|DEBUG|TcpCommunicationSpi|Remote node ID received: 41d75cf0-a42e-48e2-b23f-61f1284bd189
15:15:00.1097|DEBUG|TcpCommunicationSpi|Received handshake message [locNodeId=584e9592-b423-4da4-a37d-f65e080473cf,
rmtNodeId=41d75cf0-a42e-48e2-b23f-61f1284bd189, msg=HandshakeMessage2 [connIdx=0]]

15:15:02.1256|DEBUG|TcpCommunicationSpi|Accepted new client connection: /
15:15:02.1256|INFO    |TcpCommunicationSpi|Accepted incoming communication connection [locAddr=/,
15:15:02.1722|DEBUG|TcpCommunicationSpi|Sending local node ID to newly accepted session: GridSelectorNioSessionImpl
[ ... ]
15:15:02.1892|DEBUG|TcpCommunicationSpi|Remote node ID received: 41d75cf0-a42e-48e2-b23f-61f1284bd189
15:15:02.1892|DEBUG|TcpCommunicationSpi|Received handshake message [locNodeId=2c190af0-113f-4a12-90a2-757b5ab89220,
rmtNodeId=41d75cf0-a42e-48e2-b23f-61f1284bd189, msg=HandshakeMessage2 [connIdx=0]]

Second try
16:12:18.1313|DEBUG|TcpCommunicationSpi|Accepted new client connection: /0:0:0:0:0:0:0:1:60600
16:12:18.1313|INFO |TcpCommunicationSpi|Accepted incoming communication connection [locAddr=/0:0:0:0:0:0:0:1:47101,
16:12:18.1438|DEBUG|TcpCommunicationSpi|Sending local node ID to newly accepted session: GridSelectorNioSessionImpl
[ ... ]
16:12:18.2365|DEBUG|TcpCommunicationSpi|Remote node ID received: f0c43cea-709d-420d-8c88-420a7ac3998d
16:12:18.2551|DEBUG|TcpCommunicationSpi|Received handshake message [locNodeId=465739c3-1c7c-4cb3-812b-de0c05315304,
rmtNodeId=f0c43cea-709d-420d-8c88-420a7ac3998d, msg=HandshakeMessage2 [connIdx=0]]

16:12:17.8776|INFO |TcpCommunicationSpi|Accepted incoming communication connection [locAddr=/0:0:0:0:0:0:0:1:47100,
16:12:17.8776|DEBUG|TcpCommunicationSpi|Sending local node ID to newly accepted session: GridSelectorNioSessionImpl
[ ... ]
16:12:17.8916|DEBUG|TcpCommunicationSpi|Remote node ID received: f0c43cea-709d-420d-8c88-420a7ac3998d
16:12:17.9397|DEBUG|TcpCommunicationSpi|Received handshake message [locNodeId=be1a9f68-2e46-4e9f-8397-9b25a066d9cc,
rmtNodeId=f0c43cea-709d-420d-8c88-420a7ac3998d, msg=HandshakeMessage2 [connIdx=0]]

Note that, with the exact same configuration, communication is established over IPv4 the
first time and over IPv6 the second time.
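
For reference, the address `0:0:0:0:0:0:0:1` in the second run is the IPv6 loopback address, so those connections arrived over the local IPv6 interface. A small sketch in plain Java that classifies such an address literal (the class and method names are made up for illustration, and the port suffix from the log has to be stripped by hand first):

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative helper, not part of Ignite.
class LoopbackCheck {
    // Takes an IP literal without the port, e.g. "0:0:0:0:0:0:0:1".
    static String describe(String ipLiteral) throws UnknownHostException {
        // For an IP literal, getByName only parses the text; no DNS lookup is made.
        InetAddress addr = InetAddress.getByName(ipLiteral);
        String family = (addr instanceof Inet6Address) ? "IPv6" : "IPv4";
        return family + (addr.isLoopbackAddress() ? " loopback" : "");
    }

    public static void main(String[] args) throws UnknownHostException {
        System.out.println(describe("0:0:0:0:0:0:0:1")); // prints IPv6 loopback
        System.out.println(describe("127.0.0.1"));       // prints IPv4 loopback
    }
}
```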


From: Stanislav Lukyanov <stanlukyanov@gmail.com>
Sent: 19 March 2018 13:52
To: user@ignite.apache.org
Subject: RE: Question about client disco-/reconnect behaviour


Yes, that’s the expected behavior.
The client is connected to a single server, not both. If the server it’s connected to is
killed, the client will reconnect, producing the events in the process.
When you kill one of the servers and the client doesn’t get disconnected, it’s probably
because you’ve killed the wrong one (not the one the client is connected to).


From: Bellenger, Dominique
Sent: 19 March 2018, 15:32
To: user@ignite.apache.org
Subject: Question about client disco-/reconnect behaviour

Hello igniters,
I have a question about expected client reconnection behaviour.
I have two server nodes and one client node. When everything is connected and one of the servers
fails (because it is killed), the client is supposed to do one of the following:

1) Connect to the remaining server transparently. No Disconnect event, no Reconnect event.
2) Do nothing, because it is connected to both servers and silently switches to the remaining
connection. No Disconnect event, no Reconnect event.
3) Raise a Disconnect event, connect to the remaining server, raise a Reconnect event.

I observe the following behaviour and just wanted to know whether it is the expected one when using
Apache Ignite .NET 2.3 (2.4 behaves the same).

Once everything has started, I kill one of the server nodes, and sometimes the client is disconnected
and reconnects to the remaining server after a while. This does not, however, happen every time
I kill one of the servers. In most cases I can force a reconnect by doing the following:
- Start one server and the client (order doesn’t matter)
- Start the second server and wait until everything is connected
- Kill the first server
- Client gets disconnected / reconnected
So: is that the desired behaviour?

Thanks in advance,
