Date: Fri, 23 Nov 2018 08:28:00 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: issues@ignite.apache.org
Reply-To: dev@ignite.apache.org
Subject: [jira] [Commented] (IGNITE-10354) Failing client node due to not receiving metrics updates

    [ https://issues.apache.org/jira/browse/IGNITE-10354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16696501#comment-16696501 ]

ASF GitHub Bot commented on IGNITE-10354:
-----------------------------------------

GitHub user gromtech opened a pull request:

    https://github.com/apache/ignite/pull/5485

    IGNITE-10354 Failing client node due to not receiving metrics updates

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gridgain/apache-ignite ignite-10354

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/ignite/pull/5485.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5485

----
commit 5032b5a546c38f9a1d34325bb7a7ea39c66e46e1
Author: Roman Guseinov
Date:   2018-11-23T08:26:06Z

    IGNITE-10354 Failing client node due to not receiving metrics updates

----

> Failing client node due to not receiving metrics updates
> --------------------------------------------------------
>
>                 Key: IGNITE-10354
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10354
>             Project: Ignite
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 2.6
>            Reporter: Roman Guseinov
>            Assignee: Roman Guseinov
>            Priority: Major
>         Attachments: ClientDisconnectedTest.java
>
>
> In some cases, after the coordinator changes, the client node can be marked as failed before it is able to establish a connection to another server in the cluster.
> {code:java}
> [2018-11-21 12:21:45,769][WARN ][tcp-disco-msg-worker-#15%server-b%][TestTcpDiscoverySpi] Failing client node due to not receiving metrics updates from client node within 'IgniteConfiguration.clientFailureDetectionTimeout' (consider increasing configuration property) [timeout=10000, node=TcpDiscoveryNode [id=dc739711-f685-45e8-9017-1f91b1d86c8c, addrs=[0:0:0:0:0:0:0:1, 10.0.75.1, 127.0.0.1, 192.168.1.51, 192.168.192.1], sockAddrs=[/0:0:0:0:0:0:0:1:0, LAPTOP-6FN8RAOS/10.0.75.1:0, /127.0.0.1:0, /192.168.192.1:0, /192.168.1.51:0], discPort=0, order=2, intOrder=2, lastExchangeTime=1542774105666, loc=false, ver=2.4.0#20180830-sha1:345c0a7c, isClient=true]]
> [2018-11-21 12:21:45,791][INFO ][tcp-client-disco-msg-worker-#10%client%][TestTcpDiscoverySpi] Client node disconnected from cluster, will try to reconnect with new id [newId=46812956-2fc4-4b74-9909-d523a547ba0e, prevId=dc739711-f685-45e8-9017-1f91b1d86c8c, locNode=TcpDiscoveryNode [id=dc739711-f685-45e8-9017-1f91b1d86c8c, addrs=[0:0:0:0:0:0:0:1, 10.0.75.1, 127.0.0.1, 192.168.1.51, 192.168.192.1], sockAddrs=[/0:0:0:0:0:0:0:1:0, LAPTOP-6FN8RAOS/10.0.75.1:0, /127.0.0.1:0, /192.168.192.1:0, /192.168.1.51:0], discPort=0, order=2, intOrder=0, lastExchangeTime=1542774104031, loc=true, ver=2.4.0#20180830-sha1:345c0a7c, isClient=true]]
> {code}
> It looks like a race condition.
> Steps to reproduce:
> 1. Start server A.
> 2. Start client.
> 3. Start server B.
> 4. Stop server A.
> If Thread.sleep(10000) is added between steps 3 and 4, the client node is not disconnected from the cluster.
> A reproducer is attached: [^ClientDisconnectedTest.java].
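
For readers unfamiliar with the warning quoted in the issue, the check it describes can be modelled roughly as follows. This is only an illustrative sketch that uses nothing but the JDK: the class ClientTimeoutModel and its methods are hypothetical names, not Ignite code, and the real discovery SPI logic may well differ. It assumes a server remembers when it last received a metrics update from each client and fails the client once that timestamp is older than the configured timeout (10000 ms in the log above).

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative model only; not Ignite's implementation. All names are hypothetical. */
public class ClientTimeoutModel {
    private final long timeoutMillis;

    /** Time of the last metrics update seen from each client, keyed by client id. */
    private final Map<String, Long> lastMetricsUpdate = new HashMap<>();

    public ClientTimeoutModel(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records that a metrics update from the client arrived at the given time. */
    public void onMetricsUpdate(String clientId, long nowMillis) {
        lastMetricsUpdate.put(clientId, nowMillis);
    }

    /** Returns true if the last recorded update is older than the timeout. */
    public boolean shouldFail(String clientId, long nowMillis) {
        Long last = lastMetricsUpdate.get(clientId);
        if (last == null) {
            // In this model a client with no recorded update is never failed.
            return false;
        }
        return nowMillis - last > timeoutMillis;
    }

    public static void main(String[] args) {
        ClientTimeoutModel server = new ClientTimeoutModel(10_000);
        server.onMetricsUpdate("client", 0);
        System.out.println(server.shouldFail("client", 5_000));  // false: within the timeout
        System.out.println(server.shouldFail("client", 10_001)); // true: timeout exceeded
    }
}
```

In a model like this, the outcome depends entirely on which timestamp the server holds for the client when the check runs, which is the kind of state a coordinator change could plausibly leave stale. Whether that is the actual cause of IGNITE-10354 is not established by the report.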