ignite-user mailing list archives

From Denis Magda <dma...@gridgain.com>
Subject Re: IP based discovery, dynamic nodes, and start() deadlock
Date Mon, 02 Nov 2015 15:36:47 GMT
Hello Erik,

I got to the bottom of your issue, thanks for sharing the code.

The deadlock happens because the static IP finders don't have the local 
node's address in their lists.
If an IP finder has the local address in its list, discovery will 
allow the node with that address to form a single-node cluster.

If you set up the IP finder as shown below, the issue will disappear on 
your side.

TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();

ArrayList<String> addresses = new ArrayList<>();
addresses.add("127.0.0.1:5000");
addresses.add("127.0.0.1:6000");

ipFinder.setAddresses(addresses);


As you can see, the IP finder has to contain the addresses of all the nodes 
that must form a cluster, including the local node's own address.
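
Applied to the sample app, where -a is the node's own address and -p lists 
the peers, this means the own address has to be merged into the peer list 
before it is handed to the IP finder. A minimal sketch in plain Java follows; 
IpFinderAddresses and withLocal are hypothetical names used only for 
illustration and are not part of Ignite.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not an Ignite class): builds the address list
// for a static IP finder so that it always includes the local node.
final class IpFinderAddresses {
    private IpFinderAddresses() {}

    /** Returns the local address followed by the peers, without duplicates. */
    static List<String> withLocal(String localAddr, Collection<String> peers) {
        Set<String> all = new LinkedHashSet<>(); // keeps insertion order, drops duplicates
        all.add(localAddr);
        all.addAll(peers);
        return new ArrayList<>(all);
    }

    public static void main(String[] args) {
        // Equivalent of: -a 127.0.0.1:6000 -p 127.0.0.1:5000
        List<String> addresses = withLocal("127.0.0.1:6000", Arrays.asList("127.0.0.1:5000"));
        System.out.println(addresses);
        // The resulting list is what would be passed to
        // TcpDiscoveryVmIpFinder.setAddresses(addresses).
    }
}
```

In the sample's go() method the same effect could be had by passing the 
merged list, rather than m_peers alone, to ipFinder.setAddresses(...).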


Next, I see that you bind each node to a specific address. As you may know, 
in addition to DiscoverySpi there is a CommunicationSpi, which cluster nodes 
use heavily when one node needs to reach another directly.
You may want or need CommunicationSpi to bind to a specific address as well.
If so, I would recommend setting the host part of the 
address through IgniteConfiguration:

IgniteConfiguration icfg = new IgniteConfiguration();

icfg.setLocalHost("127.0.0.1");


The host address will be propagated to both the discovery and communication SPIs.
After that you only need to change the discovery and communication local 
ports to bind to, if required.

TcpDiscoverySpi spi = new TcpDiscoverySpi();
spi.setLocalPort(6000);

TcpCommunicationSpi cSpi = new TcpCommunicationSpi();
cSpi.setLocalPort(48000);
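
Put together, the fragments above could be assembled roughly as follows. 
This is a sketch rather than a tested setup: it assumes ignite-core is on 
the classpath, reuses the addresses and ports from this thread, and the 
class name StaticNode is made up for the example.

```java
import java.util.Arrays;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

// Illustrative class name; configures the node that listens on 127.0.0.1:6000.
class StaticNode {
    public static void main(String[] args) {
        // The IP finder lists every node of the cluster, including this one.
        TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
        ipFinder.setAddresses(Arrays.asList("127.0.0.1:5000", "127.0.0.1:6000"));

        // Discovery: only the port is set here, the host comes from setLocalHost.
        TcpDiscoverySpi discoSpi = new TcpDiscoverySpi();
        discoSpi.setLocalPort(6000);
        discoSpi.setIpFinder(ipFinder);

        // Communication: again only the port.
        TcpCommunicationSpi commSpi = new TcpCommunicationSpi();
        commSpi.setLocalPort(48000);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setLocalHost("127.0.0.1"); // shared by both SPIs
        cfg.setDiscoverySpi(discoSpi);
        cfg.setCommunicationSpi(commSpi);

        // Closing the Ignite instance stops the node again.
        try (Ignite ignite = Ignition.start(cfg)) {
            System.out.println("Nodes in topology: " + ignite.cluster().nodes().size());
        }
    }
}
```

The second node would use the same address list and only a different 
discovery port (5000 in this thread).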


Finally, to answer your question:
>> Just to confirm, what would the Ignite topology be after the following
>> sequences?
>> Node A starts (no peers)
>> Node B starts (no peers)
>> Node C starts, connects to A
>> Node D starts, connects to B
>> -- I assume here we will have two isolated clusters
> If A and C is from one network segment and B and D is from the other then
> you'll have two clusters. If all the nodes from one network segment then
> you'll have one cluster.
> I omitted a bit of context there: the above sequences, with strictly static IP discovery.
> We shouldn't have any cross-talk yet in that scenario, right?

If the IP finders of all the nodes (A, B, C and D) contain each other's 
addresses, then Ignite will assemble a 4-node cluster. If the IP finders of 
nodes A and B do NOT contain the addresses of nodes C and D, and vice versa, 
then you will have two clusters: an A and B cluster and a C and D cluster.

Hope my answer made things clearer for you,
Denis

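
The topology rule above can be illustrated with a small model in plain Java. 
It is a deliberate simplification and not Ignite code: it treats every 
address in a node's IP finder list as a link to that node and groups linked 
nodes into clusters, ignoring start order, port ranges and network 
reachability. TopologySketch and clusters are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Simplified model (not Ignite code): nodes connected through their
// IP finder lists, directly or transitively, end up in one cluster.
final class TopologySketch {
    private TopologySketch() {}

    /** Maps node name -> addresses in its IP finder; returns the resulting clusters. */
    static List<TreeSet<String>> clusters(Map<String, List<String>> finders) {
        // Union-find: every node starts as its own root.
        Map<String, String> parent = new LinkedHashMap<>();
        for (String node : finders.keySet())
            parent.put(node, node);

        for (Map.Entry<String, List<String>> e : finders.entrySet())
            for (String peer : e.getValue())
                if (parent.containsKey(peer)) // skip addresses of nodes that are not running
                    parent.put(find(parent, e.getKey()), find(parent, peer));

        // Group nodes by their root.
        Map<String, TreeSet<String>> groups = new LinkedHashMap<>();
        for (String node : parent.keySet())
            groups.computeIfAbsent(find(parent, node), k -> new TreeSet<>()).add(node);
        return new ArrayList<>(groups.values());
    }

    private static String find(Map<String, String> parent, String node) {
        while (!parent.get(node).equals(node))
            node = parent.get(node);
        return node;
    }

    public static void main(String[] args) {
        // Scenario from the question: C lists A, D lists B.
        Map<String, List<String>> finders = new LinkedHashMap<>();
        finders.put("A", Arrays.asList("A"));
        finders.put("B", Arrays.asList("B"));
        finders.put("C", Arrays.asList("C", "A"));
        finders.put("D", Arrays.asList("D", "B"));
        System.out.println(clusters(finders)); // prints [[A, C], [B, D]]
    }
}
```

If every node lists all four addresses instead, the same model yields a 
single group containing A, B, C and D.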
On 11/2/2015 2:10 PM, Erik Bunn wrote:
> Hello Denis
>
> On 02/11/15 10:00, "Denis Magda" <dmagda@gridgain.com> wrote:
>
>> - make sure that each node can reach each other over network. The ports that
>> the nodes are bound to might be closed by your firewall. In any case there
>> is always a way to set ports list to use for an IP finder:
>> https://apacheignite.readme.io/docs/cluster-config#multicast-and-static-ip-based-discovery
> This is confirmed and not the issue here.
>
>> - if you're sure that there is no any network related issue and each node
>> can reach each other then please provide us with a reproducible example.
>> Probably this is a bug and we will be able to fix it.
> My trivial sample class is listed below. It does have log4j/guava/jcommander dependencies,
> so I've also put it up at https://github.com/ebudan/ignite-static-test
>
>> In any case if you properly setup the static IP finder then you shouldn't
>> worry about any leader selection or any other discovery related
>> responsibilities. This is done out of the box.
> Happy to hear that. I'm pushing hard for a completely programmatic setup, so maybe I
> have omitted an option that resolves the RES_WAIT deadlock. If so, maybe it will be apparent
> in the code snippet below. (The test commands below run on localhost, but I tested on separate
> servers as well.)
>
>>> Just to confirm, what would the Ignite topology be after the following
>>> sequences?
>>> Node A starts (no peers)
>>> Node B starts (no peers)
>>> Node C starts, connects to A
>>> Node D starts, connects to B
>>> -- I assume here we will have two isolated clusters
>> If A and C is from one network segment and B and D is from the other then
>> you'll have two clusters. If all the nodes from one network segment then
>> you'll have one cluster.
> I omitted a bit of context there: the above sequences, with strictly static IP discovery.
> We shouldn't have any cross-talk yet in that scenario, right?
>
> Thanks!
> //eb
>
>
> Sample code to reproduce (https://github.com/ebudan/ignite-static-test):
>
> package net.memecry.ihw;
>
> import java.util.ArrayList;
> import java.util.Collection;
> import java.util.List;
> import java.util.concurrent.atomic.AtomicInteger;
> import org.apache.ignite.Ignite;
> import org.apache.ignite.Ignition;
> import org.apache.ignite.configuration.IgniteConfiguration;
> import org.apache.ignite.lang.IgniteCallable;
> import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
> import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;
> import org.apache.log4j.LogManager;
> import org.apache.log4j.Logger;
> import com.beust.jcommander.JCommander;
> import com.beust.jcommander.Parameter;
> import com.beust.jcommander.ParameterException;
> import com.google.common.net.HostAndPort;
>
> /*
>   * A sample cli app to test embedded Ignite and static IP discovery for
>   * cluster setup, and to demonstrate cluster deadlock. (Ignite version 1.4.0,
>   * also confirmed for 1.5.0-SNAPSHOT.)
>   *
>   * To run:
>   *     java -jar build/libs/HelloIgnite-all-1.0.jar -a ADDRESS:PORT -p ADDRESS:PORT,ADDRESS:PORT,... [--task]
>   *
>   * Sets a node up at the discovery address+port specified by option -a, and
>   * connects to some of the nodes running at the address+port specified by -p.
>   * Without further parameters, waits for connections; if --task is specified,
>   * launches a sample compute task from Ignite documentation.
>   *
>   * Ideally, to test with two nodes:
>   *
>   *     java -jar build/libs/HelloIgnite-all-1.0.jar -a 127.0.0.1:5000 -p 127.0.0.1:6000
>   *     java -jar build/libs/HelloIgnite-all-1.0.jar -a 127.0.0.1:6000 -p 127.0.0.1:5000 --task
>   *
>   * In practice, the above deadlocks while the nodes wait for each other's ready signal.
>   * In order to successfully start up, one of the nodes must be launched without peers:
>   *
>   *     java -jar build/libs/HelloIgnite-all-1.0.jar -a 127.0.0.1:5000
>   *     java -jar build/libs/HelloIgnite-all-1.0.jar -a 127.0.0.1:6000 -p 127.0.0.1:5000 --task
>   *
>   */
> public class Main {
>
>      static final Logger log = LogManager.getLogger( Main.class );
>
>      @Parameter( names = { "-t", "--task" }, description = "Perform sample task." )
>      private boolean m_task;
>
>      @Parameter( names = { "-a", "--addr" }, description = "Own IP:port address", required = true )
>      private String m_addr;
>
>      @Parameter( names = { "-p", "--peers" }, description = "Comma separated list of IP:port of Ignite peers. Will use given port for Ignite, port+1 for discovery.", variableArity = true )
>      private List<String> m_peers = new ArrayList<>();
>
>      static int s_count = 0;
>
>      AtomicInteger m_received = new AtomicInteger( 0 );
>
>      private Ignite m_ignite;
>
>      private void go() {
>
>      HostAndPort hp =
>      HostAndPort.fromString( m_addr ).withDefaultPort( 5000 ).requireBracketsForIPv6();
>
>      IgniteConfiguration icfg = new IgniteConfiguration();
>      TcpDiscoverySpi spi = new TcpDiscoverySpi();
>      spi.setLocalAddress( hp.getHostText() );
>      spi.setLocalPort( hp.getPort() );
>      if( m_peers.size() > 0 ) {
>          TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
>          ipFinder.setAddresses( m_peers );
>          spi.setIpFinder( ipFinder );
>          log.debug( "Searching for peers: " + m_peers );
>      }
>      icfg.setDiscoverySpi( spi );
>      log.debug( "Discovery on " + hp.getHostText() + ":" + hp.getPort() );
>          
>      log.debug( "Starting Ignite" );
>      m_ignite = Ignition.start(icfg);
>      log.debug( "Ignite started." );
>
>      if( m_task ) {
>          try {
>              Thread.sleep( 3000 );
>          } catch( InterruptedException e ) {
>          }
>          doSampleTask();
>      }
>
>      // Wait forever, in order to look at cluster establishment.
>      while( true ) {
>          try {
>              synchronized( this ) {
>                  wait();
>              }
>          } catch( InterruptedException e ) {
>              
>          }
>      }
>      }
>
>      // Sample task from Ignite tutorial.
>      private void doSampleTask() {
>      log.debug( "Launching a task." );
>      Collection<IgniteCallable<Integer>> calls = new ArrayList<>();
>      for (final String word : "Count characters using callable".split(" "))
>          calls.add( () -> {
>          log.debug( "Counting " + word.length() + " chars" );
>          return word.length();
>      });
>      Collection<Integer> res = m_ignite.compute().call(calls);
>      int sum = res.stream().mapToInt(Integer::intValue).sum();
>      log.debug( "Total number of characters is '" + sum + "'." );
>      }
>
>      
>      public static void main( String[] args ) {
>
>      Main app = new Main();
>      JCommander jc = new JCommander( app );
>      try {
>          jc.parse( args );
>          app.go();
>      } catch( ParameterException e ) {
>          jc.usage();
>          System.exit( 1 );
>      }
>      }
> }
>
