From: Dominique Bejean
Date: Tue, 17 Jul 2018 12:05:19 +0200
Subject: Solr and ZK timeout issues
To: dev@lucene.apache.org

Hi,

I first sent this question to solr-user@lucene.apache.org, but I think dev@lucene.apache.org is the right list, so I am resending it here. Sorry for the mistake.

We are experiencing an issue related to the ZooKeeper timeout.

The stack trace is:

ERROR 19 juin 2018 06:24:07,152 - h.concurrent.ConcurrentService:67 - Erreur dans l'attente de la fin de l'exécution d'un thread
ERROR 19 juin 2018 06:24:07,152 - h.concurrent.ConcurrentService:68 - org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper xxx.xxx.xxx.xxx:2181 within 10000 ms
ERROR 19 juin 2018 06:24:07,152 - api.batch.Lanceur:98 - org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper xxx.xxx.xxx.xxx:2181 within 10000 ms
java.util.concurrent.ExecutionException: org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper xxx.xxx.xxx.xxx:2181 within 10000 ms
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    ...
Caused by: org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper xxx.xxx.xxx.xxx:2181 within 10000 ms
    at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:182)
    at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:116)
    at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:106)
    at org.apache.solr.common.cloud.ZkStateReader.<init>(ZkStateReader.java:226)
    at org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.connect(ZkClientClusterStateProvider.java:121)
...


In solr.xml, we have:
    <int name="zkClientTimeout">${zkClientTimeout:30000}</int>

In solr.in.sh, we have either:
#ZK_CLIENT_TIMEOUT="15000"
or
ZK_CLIENT_TIMEOUT="30000"

So zkClientTimeout should be 30000.
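
To spell out the reasoning: the ${zkClientTimeout:30000} entry in solr.xml means "use the zkClientTimeout system property if it is set, otherwise fall back to 30000". The sketch below only illustrates that substitution rule. PlaceholderSketch and resolve are hypothetical names, this is not Solr's actual implementation, and it assumes the start script passes ZK_CLIENT_TIMEOUT on as the zkClientTimeout property.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a hypothetical resolver, not Solr's actual code.
class PlaceholderSketch {
    // Matches ${name} or ${name:default}.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    // Replaces each placeholder with the property value, or with the
    // default after the colon when the property is not set.
    static String resolve(String text, Properties props) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String fallback = m.group(2); // null when no default is given
            String value = props.getProperty(name, fallback);
            if (value == null) {
                throw new IllegalArgumentException(
                        "No value or default for " + name);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Property not set: the default after the colon is used.
        System.out.println(resolve("${zkClientTimeout:30000}", props));
        // Property set: its value overrides the default.
        props.setProperty("zkClientTimeout", "15000");
        System.out.println(resolve("${zkClientTimeout:30000}", props));
    }
}
```

Under this rule, both variants of our solr.in.sh (the line commented out, or set to 30000) lead to 30000, which is why the 10000 ms in the stack trace is surprising.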

In the source code of ZkClientClusterStateProvider.java, I see that zkClientTimeout is hardcoded to 10000! Is it expected that the configuration is not used?

lucene-solr/solr/solrj/src/java/org/apache/solr/client/solrj/impl/ZkClientClusterStateProvider.java

int zkConnectTimeout = 10000;
int zkClientTimeout = 10000;
...

zk = new ZkStateReader(zkHost, zkClientTimeout, zkConnectTimeout);

Regards.

Dominique
