ignite-issues mailing list archives

From "Alexander Menshikov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-4501) Improvement of connection in a cluster of new node
Date Thu, 13 Apr 2017 14:54:41 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967679#comment-15967679 ]

Alexander Menshikov commented on IGNITE-4501:

I am doing my best, and I hope to finish before the 2.0 release. The problem is an NPE in
ServerImpl.RingMessageWorker#processNodeAddedMessage(), at these lines:

DiscoveryDataPacket dataPacket = msg.gridDiscoveryData();
if (dataPacket.hasJoiningNodeData())
  .... //^___Here, because dataPacket is null

I can fix it by removing one line:


#processNodeAddedMessage() is really complex; it takes a while to understand what is going on
in there.

> Improvement of connection in a cluster of new node
> --------------------------------------------------
>                 Key: IGNITE-4501
>                 URL: https://issues.apache.org/jira/browse/IGNITE-4501
>             Project: Ignite
>          Issue Type: Improvement
>          Components: messaging
>    Affects Versions: 1.8
>            Reporter: Vyacheslav Daradur
>            Assignee: Alexander Menshikov
>             Fix For: 2.0
> h3. Main description:
> Cluster nodes connect in a ring.
> For example, we have 6 nodes: A, B, C, D, E, F.
> They can form a ring in any possible order: A-B-C-D-E-F-A, or A-F-B-E-C-D-A, etc.
> If a node leaves the topology, its adjacent nodes must reconnect.
> If nodes A, B, C are in one physical location, nodes D, E, F are in another, and the two locations lose their connection to each other, there are many possible ways to reconnect.
> In the best case, the ring was A-B-CxD-E-FxA ('x' means disconnect), and only one reconnect is needed (C connects to A, or F connects to D, depending on which part of the cluster stayed alive).
> If instead the ring was poorly ordered, AxFxBxExCxDxA, many reconnections are needed (A to B, B to C, C to A; in general n/2 reconnections, where n is the number of nodes).
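As a rough illustration of the two cases in this description, the Java sketch below counts how many ring links cross between the two locations, that is, how many links break when the locations lose contact. The class and method names (RingCuts, crossSiteLinks) are hypothetical and are not part of Ignite; nodes are single letters, as in the example.

```java
final class RingCuts {
    /**
     * Counts ring links whose endpoints are in different locations.
     * ring: node letters in ring order; the last node links back to the first.
     * siteOne: letters of the nodes in the first location; all others are in the second.
     */
    static int crossSiteLinks(String ring, String siteOne) {
        int cuts = 0;
        for (int i = 0; i < ring.length(); i++) {
            char from = ring.charAt(i);
            char to = ring.charAt((i + 1) % ring.length());
            boolean fromInOne = siteOne.indexOf(from) >= 0;
            boolean toInOne = siteOne.indexOf(to) >= 0;
            if (fromInOne != toInOne)
                cuts++; // this link spans both locations, so it breaks on a split
        }
        return cuts;
    }

    public static void main(String[] args) {
        System.out.println(crossSiteLinks("ABCDEF", "ABC")); // prints 2
        System.out.println(crossSiteLinks("AFBECD", "ABC")); // prints 6
    }
}
```

In the ordered ring only two links break, while in the interleaved ring every link breaks, so each surviving side has to rebuild its whole ring.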
> h3. Approach:
> We need an approach that inserts a new node at the correct place, so that the resulting ring topology is correct.
> h3. Solutions:
> The main idea is to sort nodes by latency.
> * group nodes into arcs by an ARC_ID (manually?)
> * implement NodeComparator (nodes on the same host : nodes on the same subnet : other nodes). We will use it when we connect a new node.
> * [dev list thread|http://mail-archives.apache.org/mod_mbox/ignite-dev/201612.mbox/%3CCAN+WSNyWYXSXEBpGErVt72zTgi2pTQzUWLv8JY=Ke83-5-Rh9g@mail.gmail.com%3E]
> Update Dec 29, Yakov Zhdanov:
> # Introduce a CLUSTER_REGION_ID node attribute. This can be done by adding a public static final constant to TcpDiscoverySpi.
> # Alter org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNodesRing#nextNode(java.util.Collection<org.apache.ignite.spi.discovery.tcp.internal.TcpDiscoveryNode>) to order nodes based on the per-node attribute value.
> # Node comparison should be stable and consistent. E.g. if CLUSTER_REGION_IDs are equal, then we should compare the nodes' IDs. This way we have a consistent order on all nodes in the topology.
> # Also, nextNode() has to group nodes on the same host and in the same subnet. This can be postponed and implemented after the other points are done.
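The ordering proposed in the update above (region first, node ID as a tie-breaker) can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not Ignite code: Node is a hypothetical stand-in for TcpDiscoveryNode, the region ID is assumed to be an int, and this nextNode() does not mirror the real TcpDiscoveryNodesRing signature.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;

final class RegionRingOrder {
    /** Hypothetical stand-in for a discovery node: only the fields the ordering needs. */
    static final class Node {
        final UUID id;
        final int regionId; // assumed type of the CLUSTER_REGION_ID attribute

        Node(UUID id, int regionId) {
            this.id = id;
            this.regionId = regionId;
        }
    }

    /** Region first, then node ID, so every node in the topology computes the same order. */
    static final Comparator<Node> BY_REGION_THEN_ID =
        Comparator.<Node>comparingInt(n -> n.regionId).thenComparing(n -> n.id);

    /** Returns the node that follows local in the sorted ring, wrapping around at the end. */
    static Node nextNode(List<Node> topology, Node local) {
        List<Node> ring = new ArrayList<>(topology);
        ring.sort(BY_REGION_THEN_ID);
        int idx = ring.indexOf(local);
        if (idx < 0)
            throw new IllegalArgumentException("local node is not in the topology");
        return ring.get((idx + 1) % ring.size());
    }

    public static void main(String[] args) {
        Node a = new Node(new UUID(0L, 1L), 1);
        Node b = new Node(new UUID(0L, 2L), 1);
        Node d = new Node(new UUID(0L, 3L), 2);
        List<Node> topology = List.of(d, a, b);
        // Nodes of region 1 stay adjacent regardless of the order they joined in.
        System.out.println(nextNode(topology, a) == b); // prints true
    }
}
```

Because the node ID breaks ties, two nodes never compare as equal, which is what keeps the order identical on every member of the topology.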

This message was sent by Atlassian JIRA
