From: Vidhya Kailash
Date: Thu, 1 Nov 2018 09:14:00 -0400
Subject: Re: Solr cluster tuning
To: solr-user@lucene.apache.org

Thank you Erick and Daniel for your prompt responses. We were trying a few
things (moving to G1GC, and optimizing by dropping some fields that do not
need to be indexed and stored), hence the late response.

First of all, an overview of the environment: we have a four-node SolrCloud
cluster with two indexes, each spread across 4 shards with 2 replicas. Each
node has 30GB of RAM, all dedicated to running SolrCloud alone, of which
15GB is allocated to the JVM and the rest is left for the OS to manage. All
the indexes together take up just 1.4GB on disk. We are running version 7.4
with a dedicated ZooKeeper cluster.

Something of concern I see in the Solr Admin UI is the memory usage:

[image: image.png]

This is what I see from running top:

[image: image.png]

Is there a general rule for how much memory to leave for OS caching with an
index of about 2GB?

To answer Erick's question: no, we are not indexing at the same time. In
fact, we have stopped indexing just to test that theory and don't see any
improvement, so I don't think I need to worry about autocommit then, right?

Daniel, we did try what you suggested (warm up the cache, then run a slow
test and a fast test) and we still see the slow test yielding slower
results.

Any thoughts, anyone? Your responses are much appreciated.

thanks
Vidhya


On Wed, Oct 24, 2018 at 6:40 PM Erick Erickson <erickerickson@gmail.com> wrote:
> To add to Daniel's comments: are you indexing at the same time? Say
> your autocommit time is 10 seconds. For the sake of argument, let's say
> it takes 15 queries to warm your searcher. Let's further say that the
> average time for those 15 queries is 500ms each, and once the searcher
> is warmed the average time drops to 100ms. If you fire a lot of queries
> in that window, you'll have an average close to 100ms.

> OTOH, if you only fire 15 queries over that 10 seconds, the average
> would be 500ms.
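>
> (For reference, that interval is whatever maxTime you have under the
> autoCommit/autoSoftCommit elements in solrconfig.xml; a rough sketch
> with illustrative values:
>
>   <autoCommit>
>     <maxTime>10000</maxTime>
>     <openSearcher>false</openSearcher>
>   </autoCommit>
>   <autoSoftCommit>
>     <maxTime>10000</maxTime>
>   </autoSoftCommit>
>
> It's the commits that open a new searcher, i.e. soft commits or hard
> commits with openSearcher=true, that throw away the caches and trigger
> warming again.)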

> My guess is your autowarm counts for filterCache and queryResultCache
> are the default 0, and if you set them to, say, 20 each, much of your
> problem would disappear. Ditto if you stopped indexing. Both point to
> the searchers having to pull data into memory from disk and/or rebuild
> caches.
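>
> Concretely, those counts are the autowarmCount attribute on the cache
> definitions in solrconfig.xml; a sketch (the classes and sizes below are
> just the stock example values):
>
>   <query>
>     <filterCache class="solr.FastLRUCache" size="512"
>                  initialSize="512" autowarmCount="20"/>
>     <queryResultCache class="solr.LRUCache" size="512"
>                       initialSize="512" autowarmCount="20"/>
>   </query>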

> Best,
> Erick
> On Wed, Oct 24, 2018 at 1:37 PM Davis, Daniel (NIH/NLM) [C]
> <daniel.davis@nih.gov> wrote:
> >
> > Usually, slow responses are due to I/O waits getting the data off of
> > the disk. So, to me, this seems the more likely cause: as you bombard
> > the server with queries, you cause more and more of the data needed to
> > answer them to be pulled into memory.
> >
> > To verify this, I'd bombard your server with queries to warm it up, and
> > then repeat your test with the queries coming in slowly or quickly.
> >
> > If it still holds up, then either something other than Solr is running
> > on that server and taking memory from Solr, or your index is somewhat
> > too big for your server. Linux likes to overcommit memory - try setting
> > vm.swappiness to something low, like 10, rather than the default 60.
> > Look for anything on the server with Solr that may be competing with it
> > for I/O resources and causing its pages to swap out.
> >
> > Also, look at the size of your index data.
> >
> > This is general advice for dealing with inverted indexes - some of the
> > Solr engineers on this list may have some very specific ideas, such as
> > merging activity or other background tasks running when the query load
> > is lighter. I wouldn't know how to check for these things, but would
> > think they wouldn't affect query response time that badly.
> >
> > -----Original Message-----
> > From: Vidhya Kailash <vidhya.kailash@gmail.com>
> > Sent: Wednesday, October 24, 2018 4:22 PM
> > To: solr-user@lucene.apache.org
> > Subject: Solr cluster tuning
> >
> > We are currently using SolrCloud version 7.4 with the SolrJ API to
> > fetch data from collections. We recently deployed our code to
> > production and noticed that response times are higher when the number
> > of incoming requests is lower.
> >
> > But strangely, if we bombard the system with more and more requests, we
> > get much better response times.
> >
> > My suspicion is that the client is closing connections sooner for the
> > slower requests and later for the faster ones.
> >
> > We tried tuning by passing a custom HttpClient to SolrJ and also by
> > updating the HttpShardHandlerFactory settings, for example
> > maxThreadIdleTime = 60000 and socketTimeout = 180000.
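> >
> > For reference, these are HttpShardHandlerFactory parameters; in
> > solr.xml the shardHandlerFactory section looks roughly like this
> > (showing just the two values above):
> >
> >   <shardHandlerFactory name="shardHandlerFactory"
> >                        class="HttpShardHandlerFactory">
> >     <int name="socketTimeout">180000</int>
> >     <int name="maxThreadIdleTime">60000</int>
> >   </shardHandlerFactory>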
> >
> > Wondering what other tuning we can do to make this perform the same
> > irrespective of the number of requests.
> >
> > Thanks!
> >
> > Vidhya


--
Vidhya Kailash