From dev-return-320946-archive-asf-public=cust-asf.ponee.io@lucene.apache.org Thu May 3 14:56:43 2018
Return-Path:
X-Original-To: archive-asf-public@cust-asf.ponee.io
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by mx-eu-01.ponee.io (Postfix) with SMTP id 9500B180625
    for ; Thu, 3 May 2018 14:56:42 +0200 (CEST)
Received: (qmail 62624 invoked by uid 500); 3 May 2018 12:56:41 -0000
Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@lucene.apache.org
Delivered-To: mailing list dev@lucene.apache.org
Received: (qmail 62614 invoked by uid 99); 3 May 2018 12:56:41 -0000
Received: from mail-relay.apache.org (HELO mailrelay2-lw-us.apache.org) (207.244.88.137)
    by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 03 May 2018 12:56:41 +0000
Received: from mail-io0-f176.google.com (mail-io0-f176.google.com [209.85.223.176])
    by mailrelay2-lw-us.apache.org (ASF Mail Server at mailrelay2-lw-us.apache.org) with ESMTPSA id 24D0C4076
    for ; Thu, 3 May 2018 12:56:39 +0000 (UTC)
Received: by mail-io0-f176.google.com with SMTP id e78-v6so21516020iod.0
    for ; Thu, 03 May 2018 05:56:39 -0700 (PDT)
X-Gm-Message-State: ALQs6tB9U46ErxzEOhe682FYzKrYiralHlV+VtfvWav4auH1Ll9EdHRY
    3QdWpJoGoEXI64XfYMAVleQ8qYlhn6athKRMIhk=
X-Google-Smtp-Source: AB8JxZqCrdE4ddFlrwDEo1nIvVNHvP29ixuDTHiocwt4+VUWltemOJ1YjBolcH5xrt1BCb/ov9Tq7AHoqLWxXp13Z54=
X-Received: by 2002:a6b:2b87:: with SMTP id r129-v6mr24374266ior.177.1525352199236;
    Thu, 03 May 2018 05:56:39 -0700 (PDT)
MIME-Version: 1.0
Received: by 2002:a02:390d:0:0:0:0:0 with HTTP; Thu, 3 May 2018 05:56:38 -0700 (PDT)
In-Reply-To:
References: <498897085.155.1525277585195.JavaMail.jenkins@serv1.sd-datasolutions.de>
    <854415223.173.1525347865132.JavaMail.jenkins@serv1.sd-datasolutions.de>
From: Mikhail Khludnev
Date: Thu, 3 May 2018 15:56:38 +0300
X-Gmail-Original-Message-ID:
Message-ID:
Subject: Re: [JENKINS] Lucene-Solr-7.x-Windows (32bit/jdk1.8.0_144) - Build # 575 - Still Unstable!
To: Dawid Weiss
Cc: dev@lucene.apache.org
Content-Type: multipart/alternative; boundary="000000000000fe78c6056b4cbbc6"

--000000000000fe78c6056b4cbbc6
Content-Type: text/plain; charset="UTF-8"

I'll push that stop condition under
https://issues.apache.org/jira/browse/SOLR-12307

On Thu, May 3, 2018 at 3:26 PM, Dawid Weiss <dawid.weiss@gmail.com> wrote:

> I honestly don't know (don't know much about zookeeper). I think there
> should be some kind of action to this unrecoverable situation rather
> than an endless loop :) Your patch looks good to me, but I really
> don't know much about that fragment of the code.
>
> Dawid
>
> On Thu, May 3, 2018 at 2:12 PM, Mikhail Khludnev <mkhl@apache.org> wrote:
> > I have the fix just for this spin in
> > https://issues.apache.org/jira/secure/attachment/12919074/SOLR-12200.patch
> > (Although I abandoned SOLR-12200)
> >
> > diff --git a/solr/core/src/java/org/apache/solr/cloud/autoscaling/OverseerTriggerThread.java b/solr/core/src/java/org/apache/solr/cloud/autoscaling/OverseerTriggerThread.java
> > index ece4c4c..5cb1f90 100644
> > --- a/solr/core/src/java/org/apache/solr/cloud/autoscaling/OverseerTriggerThread.java
> > +++ b/solr/core/src/java/org/apache/solr/cloud/autoscaling/OverseerTriggerThread.java
> > @@ -142,8 +142,14 @@ public class OverseerTriggerThread implements Runnable, SolrCloseable {
> >          Thread.currentThread().interrupt();
> >          log.warn("Interrupted", e);
> >          break;
> > -      } catch (IOException | KeeperException e) {
> > +      }
> > +      catch (IOException | KeeperException e) {
> >          log.error("A ZK error has occurred", e);
> > +        if (e.getCause()!=null && e.getCause() instanceof KeeperException.SessionExpiredException) {
> > +          log.warn("Solr cannot talk to ZK, exiting " +
> > +              getClass().getSimpleName() + " main queue loop", e);
> > +          return;
> > +        }
> >        }
> >      }
> >
> >
> > I can push only this, just to stop torturing Jenkins. WDYT?
> >
> > On Thu, May 3, 2018 at 2:57 PM, Dawid Weiss <dawid.weiss@gmail.com> wrote:
> >>
> >> Endless loop (session expired):
> >>
> >>    [junit4]   2> 1992793 ERROR (OverseerAutoScalingTriggerThread-72097539512664067-127.0.0.1:8983_solr-n_0000000001) [    ] o.a.s.c.a.OverseerTriggerThread A ZK error has occurred
> >>    [junit4]   2> java.io.IOException: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /autoscaling.json
> >>    [junit4]   2>        at org.apache.solr.client.solrj.impl.ZkDistribStateManager.getAutoScalingConfig(ZkDistribStateManager.java:183) ~[java/:?]
> >>    [junit4]   2>        at org.apache.solr.client.solrj.cloud.DistribStateManager.getAutoScalingConfig(DistribStateManager.java:83) ~[java/:?]
> >>    [junit4]   2>        at org.apache.solr.cloud.autoscaling.OverseerTriggerThread.run(OverseerTriggerThread.java:131) [java/:?]
> >>    [junit4]   2>        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> >>    [junit4]   2> Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /autoscaling.json
> >>    [junit4]   2>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:130) ~[zookeeper-3.4.11.jar:3.4.11-37e277162d567b55a07d1755f0b31c32e93c01a0]
> >>    [junit4]   2>        at org.apache.zookeeper.KeeperException.create(KeeperException.java:54) ~[zookeeper-3.4.11.jar:3.4.11-37e277162d567b55a07d1755f0b31c32e93c01a0]
> >>    [junit4]   2>        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1215) ~[zookeeper-3.4.11.jar:3.4.11-37e277162d567b55a07d1755f0b31c32e93c01a0]
> >>    [junit4]   2>        at org.apache.solr.common.cloud.SolrZkClient.lambda$getData$5(SolrZkClient.java:340) ~[java/:?]
> >>    [junit4]   2>        at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60) ~[java/:?]
> >>    [junit4]   2>        at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:340) ~[java/:?]
> >>    [junit4]   2>        at org.apache.solr.client.solrj.impl.ZkDistribStateManager.getAutoScalingConfig(ZkDistribStateManager.java:176) ~[java/:?]
> >>    [junit4]   2>        ... 3 more
> >>
> >>
> >> On Thu, May 3, 2018 at 1:37 PM, Policeman Jenkins Server <jenkins@thetaphi.de> wrote:
> >> > Error processing tokens: Error while parsing action 'Text/ZeroOrMore/FirstOf/Token/DelimitedToken/DelimitedToken_Action3' at input position (line 79, pos 4):
> >> > )"}
> >> >    ^
> >> >
> >> > java.lang.OutOfMemoryError: Java heap space
> >> >
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >> > For additional commands, e-mail: dev-help@lucene.apache.org
> >>
> >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
>

--
Sincerely yours
Mikhail Khludnev

--000000000000fe78c6056b4cbbc6
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--000000000000fe78c6056b4cbbc6--
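
The stop condition discussed in this thread can be sketched in isolation. This is only an illustration under stated assumptions, not the change pushed under SOLR-12307: `SessionExpiredException` here is a hypothetical stand-in for ZooKeeper's `KeeperException.SessionExpiredException`, and `ConfigSource`, `TriggerLoopSketch`, `isSessionExpired`, `runLoop` and `maxIterations` are names invented for the example. Unlike the quoted patch, which inspects only `e.getCause()`, the sketch walks the whole cause chain, including the caught exception itself; that is a choice made for the sketch, not something the thread settled.

```java
import java.io.IOException;

/** Hypothetical stand-in for ZooKeeper's KeeperException.SessionExpiredException. */
class SessionExpiredException extends Exception {
  SessionExpiredException(String message) {
    super(message);
  }
}

/** Hypothetical source of configuration whose read may fail with an I/O error. */
interface ConfigSource {
  String read() throws IOException;
}

class TriggerLoopSketch {

  /** Returns true if the throwable, or anything in its cause chain, is a session expiry. */
  static boolean isSessionExpired(Throwable t) {
    for (Throwable c = t; c != null; c = c.getCause()) {
      if (c instanceof SessionExpiredException) {
        return true;
      }
    }
    return false;
  }

  /**
   * Reads from the source up to maxIterations times and returns the number of
   * attempts made. An expired session is treated as unrecoverable and ends the
   * loop immediately; any other IOException is treated as transient and retried.
   */
  static int runLoop(ConfigSource source, int maxIterations) {
    int attempts = 0;
    while (attempts < maxIterations) {
      attempts++;
      try {
        source.read();
      } catch (IOException e) {
        if (isSessionExpired(e)) {
          return attempts; // stop instead of spinning on a dead session
        }
        // assumed transient: fall through and retry
      }
    }
    return attempts;
  }
}
```

With a source that always fails with a wrapped session expiry, `runLoop` returns after a single attempt instead of retrying until the iteration limit, which is the behaviour the patch aims for in the overseer thread.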