From: Rajan Ahlawat
Date: Wed, 15 Apr 2020 15:33:50 +0530
Subject: Re: 
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi - Failed to reconnect to cluster (will retry): class o.a.i.IgniteCheckedException: Failed to deserialize object with given class loader: org.springframework.boot.loader.LaunchedURLClassLoader
To: user@ignite.apache.org

I have shared the file with e.zhuravlev.wk@gmail.com.

We have a single Ignite instance. The file contains the full log for Mar 30, 2019. Line 6429 is the first occurrence of the error.

On Tue, Apr 14, 2020 at 8:27 PM Evgenii Zhuravlev wrote:
>
> Can you provide full log files from all nodes? It's impossible to find the root cause from this.
>
> Evgenii
>
> On Tue, Apr 14, 2020 at 07:49, Rajan Ahlawat wrote:
>>
>> The server starts with the following configuration:
>>
>> ignite_application-1-2020-03-17.log:14:[2020-03-17T08:23:33,664][INFO
>> ][main][IgniteKernal%igniteStart] IgniteConfiguration
>> [igniteInstanceName=igniteStart, pubPoolSize=32, svcPoolSize=32,
>> callbackPoolSize=32, stripedPoolSize=32, sysPoolSize=30,
>> mgmtPoolSize=4, igfsPoolSize=32, dataStreamerPoolSize=32,
>> utilityCachePoolSize=32, utilityCacheKeepAliveTime=60000,
>> p2pPoolSize=2, qryPoolSize=32,
>> igniteHome=/home/patrochandan01/ignite/apache-ignite-fabric-2.6.0-bin,
>> igniteWorkDir=/home/patrochandan01/ignite/apache-ignite-fabric-2.6.0-bin/work,
>> mbeanSrv=com.sun.jmx.mbeanserver.JmxMBeanServer@6f94fa3e,
>> nodeId=53396cb7-1b66-43da-bf10-ebb5f7cc9693,
>> marsh=org.apache.ignite.internal.binary.BinaryMarshaller@42b3b079,
>> marshLocJobs=false, daemon=false, p2pEnabled=false, netTimeout=5000,
>> sndRetryDelay=1000, sndRetryCnt=3, metricsHistSize=10000,
>> metricsUpdateFreq=2000, metricsExpTime=9223372036854775807,
>> discoSpi=TcpDiscoverySpi [addrRslvr=null, sockTimeout=0, ackTimeout=0,
>> marsh=null, reconCnt=100, reconDelay=10000,
>> maxAckTimeout=600000,
>> forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null],
>> segPlc=STOP, segResolveAttempts=2, waitForSegOnStart=true,
>> allResolversPassReq=true, segChkFreq=10000,
>> commSpi=TcpCommunicationSpi [connectGate=null, connPlc=null,
>> enableForcibleNodeKill=false, enableTroubleshootingLog=false,
>> srvLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$2@6692b6c6,
>> locAddr=null, locHost=null, locPort=47100, locPortRange=100,
>> shmemPort=-1, directBuf=true, directSndBuf=false,
>> idleConnTimeout=600000, connTimeout=5000, maxConnTimeout=600000,
>> reconCnt=10, sockSndBuf=32768, sockRcvBuf=32768, msgQueueLimit=1024,
>> slowClientQueueLimit=1000, nioSrvr=null, shmemSrv=null,
>> usePairedConnections=false, connectionsPerNode=1, tcpNoDelay=true,
>> filterReachableAddresses=false, ackSndThreshold=32,
>> unackedMsgsBufSize=0, sockWriteTimeout=2000, lsnr=null,
>> boundTcpPort=-1, boundTcpShmemPort=-1, selectorsCnt=16,
>> selectorSpins=0, addrRslvr=null,
>> ctxInitLatch=java.util.concurrent.CountDownLatch@1cd629b3[Count = 1],
>> stopping=false,
>> metricsLsnr=org.apache.ignite.spi.communication.tcp.TcpCommunicationMetricsListener@589da3f3],
>> evtSpi=org.apache.ignite.spi.eventstorage.NoopEventStorageSpi@39d76cb5,
>> colSpi=NoopCollisionSpi [], deploySpi=LocalDeploymentSpi [lsnr=null],
>> indexingSpi=org.apache.ignite.spi.indexing.noop.NoopIndexingSpi@1cb346ea,
>> addrRslvr=null, clientMode=false, rebalanceThreadPoolSize=1,
>> txCfg=org.apache.ignite.configuration.TransactionConfiguration@4c012563,
>> cacheSanityCheckEnabled=true, discoStartupDelay=60000,
>> deployMode=SHARED, p2pMissedCacheSize=100, locHost=null,
>> timeSrvPortBase=31100, timeSrvPortRange=100,
>> failureDetectionTimeout=10000, clientFailureDetectionTimeout=30000,
>> metricsLogFreq=60000,
>> hadoopCfg=null,
>> connectorCfg=org.apache.ignite.configuration.ConnectorConfiguration@14a50707,
>> odbcCfg=null, warmupClos=null, atomicCfg=AtomicConfiguration
>> [seqReserveSize=1000, cacheMode=PARTITIONED, backups=1, aff=null,
>> grpName=null], classLdr=null, sslCtxFactory=null, platformCfg=null,
>> binaryCfg=null, memCfg=null, pstCfg=null,
>> dsCfg=DataStorageConfiguration [sysRegionInitSize=41943040,
>> sysCacheMaxSize=104857600, pageSize=0, concLvl=25,
>> dfltDataRegConf=DataRegionConfiguration [name=Default_Region,
>> maxSize=20971520, initSize=15728640, swapPath=null,
>> pageEvictionMode=RANDOM_2_LRU, evictionThreshold=0.9,
>> emptyPagesPoolSize=100, metricsEnabled=false,
>> metricsSubIntervalCount=5, metricsRateTimeInterval=60000,
>> persistenceEnabled=false, checkpointPageBufSize=0], storagePath=null,
>> checkpointFreq=180000, lockWaitTime=10000, checkpointThreads=4,
>> checkpointWriteOrder=SEQUENTIAL, walHistSize=20, walSegments=10,
>> walSegmentSize=67108864, walPath=db/wal,
>> walArchivePath=db/wal/archive, metricsEnabled=false, walMode=LOG_ONLY,
>> walTlbSize=131072, walBuffSize=0, walFlushFreq=2000,
>> walFsyncDelay=1000, walRecordIterBuffSize=67108864,
>> alwaysWriteFullPages=false,
>> fileIOFactory=org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory@4bd31064,
>> metricsSubIntervalCnt=5, metricsRateTimeInterval=60000,
>> walAutoArchiveAfterInactivity=-1, writeThrottlingEnabled=false,
>> walCompactionEnabled=false], activeOnStart=true, autoActivation=true,
>> longQryWarnTimeout=3000, sqlConnCfg=null,
>> cliConnCfg=ClientConnectorConfiguration [host=null, port=10800,
>> portRange=100, sockSndBufSize=0, sockRcvBufSize=0, tcpNoDelay=true,
>> maxOpenCursorsPerConn=128, threadPoolSize=32, idleTimeout=0,
>> jdbcEnabled=true, odbcEnabled=true, thinCliEnabled=true,
>> sslEnabled=false, useIgniteSslCtxFactory=true, sslClientAuth=false,
>> sslCtxFactory=null], authEnabled=false, failureHnd=null,
>> commFailureRslvr=null]
>>
>> and the error while connecting the client:
>>
>> [2020-04-14T09:41:33,547][WARN
>> ][grid-timeout-worker-#71%igniteStart%][TcpDiscoverySpi] Socket write
>> has timed out (consider increasing 'sockTimeout' configuration
>> property) [sockTimeout=5000, rmtAddr=/10.80.104.224:51856,
>> rmtPort=51856, sockTimeout=5000]
>>
>> In the server configuration we didn't define any socketTimeout; the server,
>> not the client, might be throwing the socket timeout. But it occurs for only
>> one particular client and this server. Other web applications are able
>> to connect to the same server in our production environment.
>>
>> Thanks
>>
>> On Mon, Apr 13, 2020 at 8:09 PM Evgenii Zhuravlev
>> wrote:
>> >
>> > Hi,
>> >
>> > Can you share full logs from all nodes? I mean log files, not the console output.
>> >
>> > Evgenii
>> >
>> > On Sun, Apr 12, 2020 at 20:30, Rajan Ahlawat wrote:
>> >>
>> >> ?
>> >>
>> >> On Thu, Apr 9, 2020 at 3:11 AM Rajan Ahlawat wrote:
>> >> >
>> >> > ---------- Forwarded message ---------
>> >> > From: Rajan Ahlawat
>> >> > Date: Thu, Apr 9, 2020 at 3:09 AM
>> >> > Subject: org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi - Failed
>> >> > to reconnect to cluster (will retry): class
>> >> > o.a.i.IgniteCheckedException: Failed to deserialize object with given
>> >> > class loader: org.springframework.boot.loader.LaunchedURLClassLoader
>> >> > To:
>> >> >
>> >> >
>> >> > Hi
>> >> >
>> >> > We suddenly started getting the following exception on the client side after
>> >> > the node running the application was restarted:
>> >> >
>> >> > org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi - Failed to
>> >> > reconnect to cluster (will retry): class o.a.i.IgniteCheckedException:
>> >> > Failed to deserialize object with given class loader:
>> >> > org.springframework.boot.loader.LaunchedURLClassLoader
>> >> >
>> >> > I see a similar bug was raised here for version 2.7.0:
>> >> > https://issues.apache.org/jira/browse/IGNITE-11730
>> >> >
>> >> > We are currently using version 2.6.0.
>> >> > Following is our TcpDiscoverySpi configuration:
>> >> >
>> >> > private void setDiscoverySpiConfig(IgniteConfiguration cfg) {
>> >> >     TcpDiscoverySpi discoverySpi = new TcpDiscoverySpi();
>> >> >
>> >> >     setTcpIpFinder(discoverySpi);
>> >> >     discoverySpi.setNetworkTimeout(platformCachingConfiguration.getIgnite().getSocketTimeout());
>> >> >     discoverySpi.setSocketTimeout(platformCachingConfiguration.getIgnite().getSocketTimeout());
>> >> >     discoverySpi.setJoinTimeout(platformCachingConfiguration.getIgnite().getJoinTimeout());
>> >> >     discoverySpi.setClientReconnectDisabled(platformCachingConfiguration.getIgnite().isClientReconnectDisabled());
>> >> >     discoverySpi.setReconnectCount(platformCachingConfiguration.getIgnite().getReconnectCount());
>> >> >     discoverySpi.setReconnectDelay(platformCachingConfiguration.getIgnite().getReconnectDelay());
>> >> >
>> >> >     cfg.setDiscoverySpi(discoverySpi);
>> >> > }
>> >> >
>> >> > Its IP finder config is:
>> >> >
>> >> > private void setTcpIpFinder(TcpDiscoverySpi discoverySpi) {
>> >> >     TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
>> >> >
>> >> >     ipFinder.setAddresses(platformCachingConfiguration.getIgnite().getNodes());
>> >> >     discoverySpi.setIpFinder(ipFinder);
>> >> > }
>> >> >
>> >> > We have tried every combination of timeouts; right now the timeouts are
>> >> > set to very high values.
>> >> >
>> >> > (1) Are we hitting the same bug reported for version 2.7.0? The bug
>> >> > description says it occurs on the server side, but we are getting the exact
>> >> > same stack trace in ClientImpl.java on the client side.
>> >> > (2) Assuming it is the same issue, is there a way to disable the data bag
>> >> > compression check, since upgrading both client and server versions
>> >> > would not be possible immediately?
>> >> >
>> >> > Thanks in advance.
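
For anyone comparing the timeout values quoted in this thread: the startup dump shows sockTimeout=0 for the discovery SPI, while the warning reports sockTimeout=5000. Below is a rough sketch of a helper for pulling key=value pairs out of such log fragments so they can be compared side by side. The class DiscoveryLogFields and its parse method are hypothetical names made up for this note, not part of Apache Ignite, and the regular expression only handles simple, unnested values. It makes the difference easy to spot but does not explain it.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading Ignite log fragments; not an Ignite API.
public class DiscoveryLogFields {
    // key=value, where the value runs until a comma, square bracket, or whitespace.
    private static final Pattern PAIR = Pattern.compile("(\\w+)=([^,\\[\\]\\s]+)");

    /** Returns the key=value pairs in a log fragment; the first occurrence of a key wins. */
    public static Map<String, String> parse(String fragment) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(fragment);
        while (m.find()) {
            fields.putIfAbsent(m.group(1), m.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Discovery SPI part of the startup dump quoted in this thread.
        String serverDump = "discoSpi=TcpDiscoverySpi [addrRslvr=null, sockTimeout=0, ackTimeout=0, "
            + "marsh=null, reconCnt=100, reconDelay=10000, maxAckTimeout=600000, "
            + "forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null]";

        // Warning quoted in this thread.
        String warning = "Socket write has timed out (consider increasing 'sockTimeout' "
            + "configuration property) [sockTimeout=5000, rmtAddr=/10.80.104.224:51856, "
            + "rmtPort=51856, sockTimeout=5000]";

        System.out.println("startup dump sockTimeout = " + parse(serverDump).get("sockTimeout"));
        System.out.println("warning sockTimeout      = " + parse(warning).get("sockTimeout"));
        System.out.println("warning remote address   = " + parse(warning).get("rmtAddr"));
    }
}
```

Because the first occurrence of a key wins, the helper is best fed one SPI fragment at a time; in the full IgniteConfiguration dump, names such as reconCnt and locHost appear more than once.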