From: Ilya Kasnacheev
Date: Thu, 31 May 2018 15:02:35 +0300
Subject: Re: SQL Query error
To: user@ignite.apache.org

Hello!

It looks like your cluster was in a hung state (unable to perform partition map exchange when a client node left or joined the topology) due to a stuck cache operation.

As witnessed by:
2018-05-31 12:10:37,781 [35] WARN org.apache.ignite.internal.diagnostic - Failed to wait for partition release future [topVer=AffinityTopologyVersion [topVer=70, minorTopVer=0], node=646cb075-dd64-4a90-a5a8-23f3b97f4d36]
2018-05-31 12:10:37,781 [35] WARN org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture - Partition release future: PartitionReleaseFuture [topVer=AffinityTopologyVersion [topVer=70, minorTopVer=0], futures=[ExplicitLockReleaseFuture [topVer=AffinityTopologyVersion [topVer=70, minorTopVer=0], futures=[]], TxReleaseFuture [topVer=AffinityTopologyVersion [topVer=70, minorTopVer=0], futures=[]], AtomicUpdateReleaseFuture [topVer=AffinityTopologyVersion [topVer=70, minorTopVer=0], futures=[GridDhtAtomicUpdateFuture [updateCntr=7, super=GridDhtAtomicAbstractUpdateFuture [futId=85, resCnt=0, addedReader=false, dhtRes={}]]]], DataStreamerReleaseFuture [topVer=AffinityTopologyVersion [topVer=70, minorTopVer=0], futures=[]]]]
...
2018-05-31 12:10:37,781 [35] WARN org.apache.ignite.internal.diagnostic - Pending atomic cache futures:
2018-05-31 12:10:37,781 [35] WARN org.apache.ignite.internal.diagnostic - >>> GridDhtAtomicUpdateFuture [updateCntr=7, super=GridDhtAtomicAbstractUpdateFuture [futId=85, resCnt=0, addedReader=false, dhtRes={}]]
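
The two warnings in the excerpt above are the markers worth searching for in the other nodes' logs. Here is a minimal Python sketch of such a scan. It assumes log lines shaped like the excerpt; the function names and regular expressions are illustrative only and are not part of Ignite, and the message layout may differ between Ignite versions.

```python
import re

# Layout assumed from the single excerpt in this thread.
RELEASE_WAIT = re.compile(
    r"Failed to wait for partition release future "
    r"\[topVer=AffinityTopologyVersion \[topVer=(\d+), minorTopVer=(\d+)\], "
    r"node=([0-9a-f-]{36})\]"
)
# Futures are listed one per line after a '>>>' marker.
PENDING_FUTURE = re.compile(r">>> (\w+Future) \[")

SAMPLE = [
    "2018-05-31 12:10:37,781 [35] WARN org.apache.ignite.internal.diagnostic - "
    "Failed to wait for partition release future [topVer=AffinityTopologyVersion "
    "[topVer=70, minorTopVer=0], node=646cb075-dd64-4a90-a5a8-23f3b97f4d36]",
    "2018-05-31 12:10:37,781 [35] WARN org.apache.ignite.internal.diagnostic - "
    ">>> GridDhtAtomicUpdateFuture [updateCntr=7, "
    "super=GridDhtAtomicAbstractUpdateFuture [futId=85, resCnt=0, "
    "addedReader=false, dhtRes={}]]",
]


def find_stuck_exchanges(lines):
    """Return (topVer, minorTopVer, node_id) for each release-wait warning."""
    hits = []
    for line in lines:
        match = RELEASE_WAIT.search(line)
        if match:
            hits.append((int(match.group(1)), int(match.group(2)), match.group(3)))
    return hits


def find_pending_futures(lines):
    """Return the class names of the futures reported as still pending."""
    names = []
    for line in lines:
        match = PENDING_FUTURE.search(line)
        if match:
            names.append(match.group(1))
    return names
```

Running both functions over each node's log should show which topology version the exchange is waiting on and which kind of operation (here an atomic update) is holding it up.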

Once you killed one of the nodes, the operation was no longer considered stuck and your cluster recovered.
It's hard to say why this operation failed to complete. I suggest collecting Java thread dumps (with jstack) from all nodes the next time you notice that the cluster is stuck.
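
Collecting the dumps can be scripted so that nothing is forgotten while the cluster is hung. This is a minimal Python sketch under stated assumptions: a JDK with `jstack` is on the PATH of each node, and the script runs locally on each node with that node's JVM process ID. The helper names, the `host` label and the file naming scheme are hypothetical.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def jstack_command(pid):
    """Build the jstack invocation; -l adds lock information to the dump."""
    return ["jstack", "-l", str(pid)]


def dump_path(out_dir, host, pid, when):
    """Name the file after host, PID and time so dumps from all nodes line up."""
    return Path(out_dir) / f"{host}-{pid}-{when:%Y%m%dT%H%M%S}.jstack.txt"


def collect_dump(pid, host, out_dir="."):
    """Run jstack against one JVM and save its output to a file."""
    target = dump_path(out_dir, host, pid, datetime.now())
    result = subprocess.run(
        jstack_command(pid), capture_output=True, text=True, check=True
    )
    target.write_text(result.stdout)
    return target
```

Calling `collect_dump(pid, host)` a few times, some seconds apart, on every node usually makes it easier to tell a thread that is truly blocked from one that is merely busy.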
Regards,



--
Ilya Kasnacheev

2018-05-31 14:01 GMT+03:00 Stéphane Gayet <Stephane.Gayet@misterfly.com>:

Hi Ilya,


Thanks for your help.


Could you access the files at https://drive.google.com/drive/folders/1apPraUn-Z2wKXFr5Wsdf8qHIQssW0XgC?usp=sharing


Regards,


From: Ilya Kasnacheev <ilya.kasnacheev@gmail.com>
Sent: Thursday, 31 May 2018 12:32:35
To: user@ignite.apache.org
Subject: Re: SQL Query error

Hello!

Full Ignite logs from the problematic node would be helpful. Can you upload the log file somewhere?

Regards,

--
Ilya Kasnacheev

2018-05-31 12:37 GMT+03:00 Stéphane Gayet <Stephane.Gayet@misterfly.com>:

Hi All,


A correction to my previous email.

When the cluster stops responding to the SQL query, I can identify the faulting node from the following exception:

2018-05-31 11:20:17,036 [75] ERROR ServiceCache - Failed to get RtProposal
Apache.Ignite.Core.Cache.CacheException: Failed to run map query remotely.Failed to execute map query on the node: 90bd677d-dee5-44bb-af6f-80786b85bd37, class org.apache.ignite.internal.processors.query.IgniteSQLException:Failed to set schema for DB connection for thread [schema=owproposals] ---> Apache.Ignite.Core.Common.JavaException: javax.cache.CacheException: Failed to run map query remotely.Failed to execute map query on the node: 90bd677d-dee5-44bb-af6f-80786b85bd37, class org.apache.ignite.internal.processors.query.IgniteSQLException:Failed to set schema for DB connection for thread [schema=owproposals]
   at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.query(GridReduceQueryExecutor.java:747)
   at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$8.iterator(IgniteH2Indexing.java:1339)
   at org.apache.ignite.internal.processors.cache.QueryCursorImpl.iterator(QueryCursorImpl.java:95)
   at org.apache.ignite.internal.processors.platform.cache.query.PlatformAbstractQueryCursor.processInLongOutLong(PlatformAbstractQueryCursor.java:147)
   at org.apache.ignite.internal.processors.platform.PlatformTargetProxyImpl.inLongOutLong(PlatformTargetProxyImpl.java:55)
Caused by: javax.cache.CacheException: Failed to execute map query on the node: 90bd677d-dee5-44bb-af6f-80786b85bd37, class org.apache.ignite.internal.processors.query.IgniteSQLException:Failed to set schema for DB connection for thread [schema=owproposals]
   at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.fail(GridReduceQueryExecutor.java:275)
   at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.onFail(GridReduceQueryExecutor.java:265)
   at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.onMessage(GridReduceQueryExecutor.java:244)
   at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor$2.onMessage(GridReduceQueryExecutor.java:188)
   at org.apache.ignite.internal.managers.communication.GridIoManager$ArrayListener.onMessage(GridIoManager.java:2332)
   at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
   at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
   at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
   at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
   at java.lang.Thread.run(Unknown Source)

   at Apache.Ignite.Core.Impl.Unmanaged.Jni.Env.ExceptionCheck()
   at Apache.Ignite.Core.Impl.Unmanaged.Jni.Env.CallLongMethod(GlobalRef obj, IntPtr methodId, Int64* argsPtr)
   at Apache.Ignite.Core.Impl.Unmanaged.UnmanagedUtils.TargetInLongOutLong(GlobalRef target, Int32 opType, Int64 memPtr)
   at Apache.Ignite.Core.Impl.PlatformJniTarget.InLongOutLong(Int32 type, Int64 val)
   --- End of inner exception stack trace ---
   at Apache.Ignite.Core.Impl.PlatformJniTarget.InLongOutLong(Int32 type, Int64 val)
   at Apache.Ignite.Core.Impl.Cache.Query.QueryCursorBase`1.GetEnumerator()
   at MrFly.CacheDlm.Common.Services.ServiceCache.BuildRtProposalFromOwProposal(String enterprise, String route1, String route2, DateTime departureDate, DateTime returnDate, Int32 nbAdt, Int32 nbChd, Int32 nbInf) in C:\DevRoot\A.Ignite\MrFly.CacheDlm.Common\Services\ServiceCache.cs:line 1015
   at MrFly.CacheDlm.Common.Services.ServiceCache.BuildRtProposal(String enterprise, String route1, String route2, DateTime departureDate, DateTime returnDate, Int32 nbAdt, Int32 nbChd, Int32 nbInf) in C:\DevRoot\A.Ignite\MrFly.CacheDlm.Common\Services\ServiceCache.cs:line 913
   at MrFly.CacheDlm.Common.Services.ServiceCache.GetRtProposal(String enterprise, String route1, String route2, DateTime departureDate, DateTime returnDate, Int32 nbAdt, Int32 nbChd, Int32 nbInf, Boolean withCache) in C:\DevRoot\A.Ignite\MrFly.CacheDlm.Common\Services\ServiceCache.cs:line 646


If I stop Ignite on this node, the cluster starts responding again.
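
The node ID can be pulled out of the exception text automatically rather than read by eye. This is a minimal Python sketch that assumes the message wording seen in the trace above; the function name and pattern are illustrative and not part of Ignite or Ignite.NET.

```python
import re

# Matches the node UUID that follows the fixed phrase in the error message.
NODE_IN_ERROR = re.compile(
    r"Failed to execute map query on the node: "
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def failing_nodes(message):
    """Return the distinct node IDs named in the message, in order of appearance."""
    seen = []
    for node_id in NODE_IN_ERROR.findall(message):
        if node_id not in seen:
            seen.append(node_id)
    return seen
```

The same ID is repeated in the outer and inner exception, which is why duplicates are dropped. Note that this only identifies the node that reported the map-query failure; it does not by itself explain why that node failed.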


So my questions are: why does the node stop responding, and how can I identify the root cause?


Regards,

Stephane Gayet




From: Stéphane Gayet <Stephane.Gayet@misterfly.com>
Sent: Wednesday, 30 May 2018 23:44
To: user@ignite.apache.org
Subject: SQL Query error


Hi all,


We have installed an Ignite 2.4 cluster:

- 3 nodes running Windows (24 GB, 16 GB, 16 GB)
- 4 caches configured, partitioned, no backups
- no persistence

Cache @c0 contains around 60,000 items,
cache @c3 contains few items (around 200), but the items are very large.

We run SQL queries that aggregate the @c0 items into large collections (up to 14,000 items per collection) and store the result in @c3.

After a while, the SQL queries stop working. The following error is logged:

2018-05-30 23:14:42,716 [282] ERROR org.apache.ignite.internal.processors.query.h2.twostep.GridMapQueryExecutor - Failed to execute local query.

Here is our cache configuration:

    <cacheConfiguration>
      <cacheConfiguration name="owproposals" cacheMode="Partitioned" backups="0" readThrough="true" writeThrough="true" writeBehindEnabled="true">
        <cacheStoreFactory type="OwProposalFactory"/>
        <queryEntities>
          <queryEntity valueType="OwProposal, Cache.Common"
                       valueTypeName="Models.OwProposal"/>
        </queryEntities>
      </cacheConfiguration>
      <cacheConfiguration name="rtproposals" cacheMode="Partitioned" backups="0" readThrough="true" writeThrough="true" writeBehindEnabled="true">
        <cacheStoreFactory type="RtProposalFactory"/>
        <queryEntities>
          <queryEntity valueType="RtProposal, Cache.Common"
                       valueTypeName="Models.RtProposal"/>
        </queryEntities>
      </cacheConfiguration>
      <cacheConfiguration name="owcollection" cacheMode="Partitioned" backups="0">
        <queryEntities>
          <queryEntity valueType="OwCollection, Cache.Common"
                       valueTypeName="Models.OwCollection"/>
        </queryEntities>
      </cacheConfiguration>
      <cacheConfiguration name="rtcollection" cacheMode="Partitioned" backups="0">
        <queryEntities>
          <queryEntity valueType="RtCollection, Cache.Common"
                       valueTypeName="Models.RtCollection"/>
        </queryEntities>
      </cacheConfiguration>
    </cacheConfiguration>


I tried to clear the items of the @c0 cache before repopulating it, but I got this error:

2018-05-30 23:34:34,333 [16] ERROR ServiceCache - Failed to delete OwItems older than 2018-05-31
Apache.Ignite.Core.Common.IgniteException: Failed to execute map query on the node: b9f240d6-0ee6-4dee-bda0-51088a743481, class org.apache.ignite.internal.processors.query.IgniteSQLException:Failed to set schema for DB connection for thread [schema=owproposals] ---> Apache.Ignite.Core.Common.JavaException: class org.apache.ignite.IgniteCheckedException: Failed to execute map query on the node: b9f240d6-0ee6-4dee-bda0-51088a743481, class org.apache.ignite.internal.processors.query.IgniteSQLException:Failed to set schema for DB connection for thread [schema=owproposals]
   at org.apache.ignite.internal.processors.platform.utils.PlatformUtils.unwrapQueryException(PlatformUtils.java:519)
   at org.apache.ignite.internal.processors.platform.cache.query.PlatformAbstractQueryCursor.processOutStream(PlatformAbstractQueryCursor.java:132)
   at org.apache.ignite.internal.processors.platform.PlatformTargetProxyImpl.outStream(PlatformTargetProxyImpl.java:93)
Caused by: javax.cache.CacheException: Failed to run map query remotely.Failed to execute map query on the node: b9f240d6-0ee6-4dee-bda0-51088a743481, class org.apache.ignite.internal.processors.query.IgniteSQLException:Failed to set schema for DB connection for thread [schema=owproposals]
   at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.query(GridReduceQueryExecutor.java:747)
   at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$8.iterator(IgniteH2Indexing.java:1339)
   at org.apache.ignite.internal.processors.cache.QueryCursorImpl.iterator(QueryCursorImpl.java:95)
   at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing$9.iterator(IgniteH2Indexing.java:1403)
   at org.apache.ignite.internal.processors.cache.QueryCursorImpl.iterator(QueryCursorImpl.java:95)
   at org.apache.ignite.internal.processors.cache.QueryCursorImpl.getAll(QueryCursorImpl.java:127)
   at org.apache.ignite.internal.processors.platform.cache.query.PlatformAbstractQueryCursor.processOutStream(PlatformAbstractQueryCursor.java:127)
   ... 1 more
Caused by: javax.cache.CacheException: Failed to execute map query on the node: b9f240d6-0ee6-4dee-bda0-51088a743481, class org.apache.ignite.internal.processors.query.IgniteSQLException:Failed to set schema for DB connection for thread [schema=owproposals]
   at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.fail(GridReduceQueryExecutor.java:275)
   at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.onFail(GridReduceQueryExecutor.java:265)
   at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor.onMessage(GridReduceQueryExecutor.java:244)
   at org.apache.ignite.internal.processors.query.h2.twostep.GridReduceQueryExecutor$2.onMessage(GridReduceQueryExecutor.java:188)
   at org.apache.ignite.internal.managers.communication.GridIoManager$ArrayListener.onMessage(GridIoManager.java:2332)
   at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
   at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
   at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
   at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
   at java.lang.Thread.run(Unknown Source)

   at Apache.Ignite.Core.Impl.Unmanaged.Jni.Env.ExceptionCheck()
   at Apache.Ignite.Core.Impl.Unmanaged.UnmanagedUtils.TargetOutStream(GlobalRef target, Int32 opType, Int64 memPtr)
   at Apache.Ignite.Core.Impl.PlatformJniTarget.OutStream[T](Int32 type, Func`2 readAction)
   --- End of inner exception stack trace ---
   at Apache.Ignite.Core.Impl.PlatformJniTarget.OutStream[T](Int32 type, Func`2 readAction)
   at Apache.Ignite.Core.Impl.Cache.Query.QueryCursorBase`1.GetAll()
   at MrFly.CacheDlm.Common.Services.ServiceCache.GetKeys[TV](QueryBase query)
   at MrFly.CacheDlm.Common.Services.ServiceCache.DeleteOneWayItems(DateTime createdDate)

At the moment, the only way to recover is to shut down all three nodes and restart them from scratch.

Any idea what is malfunctioning or misconfigured?

Kind regards,

S Gayet





