lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Ulicny <culi...@iq.media>
Subject Re: Solr index slow response
Date Tue, 19 Mar 2019 13:07:36 GMT
Do you know what is causing the 30% CPU usage? The seems awfully high for
solr when only indexing (unless it is iowait), but could be expected based
on custom update processors and analyzers you may have.

Are you putting all of the documents into a single core or multiple? Also
are you using SATA or PCIe solid state drives?

We have a similar index situation. About 65million documents that take up
about 4.5TB for a single copy of the index. We've never tried to put all of
the documents into a single solr core before, so I don't know how well that
would work.

On Tue, Mar 19, 2019 at 8:28 AM Aaron Yingcai Sun <YSun@vizrt.com> wrote:

> Yes, the same behavior even with a single thread client. The following
> page says "In general, adding many documents per update request is faster
> than one per update request."  but in reality, add many documents per
> request result in much longer response time, it's not liner, response time
> of 100 docs per request  is bigger than (the response time of 10 docs per
> request) * 10.
>
>
> https://wiki.apache.org/solr/SolrPerformanceFactors#mergeFactor
>
> SolrPerformanceFactors - Solr Wiki<
> https://wiki.apache.org/solr/SolrPerformanceFactors#mergeFactor>
> wiki.apache.org
> Schema Design Considerations. indexed fields. The number of indexed fields
> greatly increases the following: Memory usage during indexing ; Segment
> merge time
>
>
>
>
> ________________________________
> From: Emir Arnautović <emir.arnautovic@sematext.com>
> Sent: Tuesday, March 19, 2019 1:00:19 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr index slow response
>
> If you start indexing with just a single thread/client, do you still see
> slow bulks?
>
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 19 Mar 2019, at 12:54, Aaron Yingcai Sun <YSun@vizrt.com> wrote:
> >
> > "QTime" value is from the solr rest api response, extracted from the
> http/json payload.  The "Request time" is what I measured from client side,
> it's almost the same value as QTime, just some milliseconds difference.  I
> could provide tcpdump to prove that it is really solr slow response.
> >
> > Those long response time is not really spikes, it's constantly
> happening, almost half of the request has such long delay.  The more
> document added in one request the more delay it has.
> >
> > ________________________________
> > From: Emir Arnautović <emir.arnautovic@sematext.com>
> > Sent: Tuesday, March 19, 2019 12:30:33 PM
> > To: solr-user@lucene.apache.org
> > Subject: Re: Solr index slow response
> >
> > Just to add different perspective here: how do you send documents to
> Solr? Are those log lines from your client? Maybe it is not Solr that is
> slow. Could it be network or client itself. If you have some dry run on
> client, maybe try running it without Solr to eliminate client from the
> suspects.
> >
> > Do you observe similar spikes when you run indexing with less concurrent
> clients?
> >
> > It is really hard to pinpoint the issue without looking at some
> monitoring tool.
> >
> > Emir
> > --
> > Monitoring - Log Management - Alerting - Anomaly Detection
> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >
> >
> >
> >> On 19 Mar 2019, at 09:17, Aaron Yingcai Sun <YSun@vizrt.com> wrote:
> >>
> >> We have around 80 million documents to index, total index size around
> 3TB,  I guess I'm not the first one to work with this big amount of data.
> with such slow response time, the index process would take around 2 weeks.
> While the system resource is not very loaded, there must be a way to speed
> it up.
> >>
> >>
> >> To Walter, I don't see why G1GC would improve this, we only do index,
> no query in the background. There is no memory constraint.  it's more feel
> like some internal thread are blocking each other.
> >>
> >>
> >> I used to run with more documents in one request, that give much worse
> response time, 300 documents in one request could end up 20 minutes
> response time, now I changed to max 10 documents in one request, still many
> response time around 30 seconds, while some of them are very fast( ~100
> ms).  How come there are such big difference? the documents size does not
> have such big difference.
> >>
> >>
> >> I just want to speed it up since nothing seems to be overloaded.  Are
> there any other faster way to index such big amount of data?
> >>
> >>
> >> BRs
> >>
> >> //Aaron
> >>
> >> ________________________________
> >> From: Walter Underwood <wunder@wunderwood.org>
> >> Sent: Monday, March 18, 2019 4:59:20 PM
> >> To: solr-user@lucene.apache.org
> >> Subject: Re: Solr index slow response
> >>
> >> Solr is not designed to have consistent response times for updates. You
> are expecting Solr to do something that it does not do.
> >>
> >> About Xms and Xmx, the JVM will continue to allocate memory until it
> hits the max. After it hits the max, it will start to collect garbage. A
> smaller Xms just wastes time doing allocations after the JVM is running.
> Avoid that by making Xms and Xms the same.
> >>
> >> We run all of our JVMs with 8 GB of heap and the G1 collector. You
> probably do not need more than 8 GB unless you are doing high-cardinality
> facets or some other memory-hungry querying.
> >>
> >> The first step would be to use a good configuration. We start our Java
> 8 JVMs with these parameters:
> >>
> >> SOLR_HEAP=8g
> >> # Use G1 GC  -- wunder 2017-01-23
> >> # Settings from https://wiki.apache.org/solr/ShawnHeisey
> >> GC_TUNE=" \
> >> -XX:+UseG1GC \
> >> -XX:+ParallelRefProcEnabled \
> >> -XX:G1HeapRegionSize=8m \
> >> -XX:MaxGCPauseMillis=200 \
> >> -XX:+UseLargePages \
> >> -XX:+AggressiveOpts \
> >> “
> >>
> >> Use SSD for disks, with total space about 3X as big as the expected
> index size.
> >>
> >> Have RAM not used by Solr or the OS that is equal to the expected index
> size.
> >>
> >> After that, let’s figure out what the real requirement is. If you must
> have consistent response times for update requests, you’ll need to do that
> outside of Solr. But if you need high data import rates, we can probably
> help.
> >>
> >> wunder
> >> Walter Underwood
> >> wunder@wunderwood.org
> >> http://observer.wunderwood.org/  (my blog)
> >>
> >>> On Mar 18, 2019, at 8:31 AM, Aaron Yingcai Sun <YSun@vizrt.com> wrote:
> >>>
> >>> Hello, Chris
> >>>
> >>>
> >>> Thanks for the tips.  So I tried to set it as you suggested, not see
> too much improvement.  Since I don't need it visible immediately,
> softCommit is disabled totally.
> >>>
> >>> The slow response is happening every few seconds,  if it happens
> hourly I would suspect the hourly auto-commit.  But it happen much more
> frequently.  I don't see any CPU/RAM/NETWORK IO/DISK IO bottleneck on OS
> level.  It just looks like solr server is blocking internally itself.
> >>>
> >>>
> >>> <       <maxTime>${solr.autoCommit.maxTime:3600000}</maxTime>
> >>> ---
> >>>>    <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
> >>> 16c16
> >>> <       <openSearcher>true</openSearcher>
> >>> ---
> >>>>    <openSearcher>false</openSearcher>
> >>>
> >>>
> >>>
> >>> 190318-162811.610-189982 DBG1:doc_count: 10 , doc_size: 539  KB, Res
> code: 200, QTime: 1405 ms, Request time: 1407 ms.
> >>> 190318-162811.636-189968 DBG1:doc_count: 10 , doc_size: 465  KB, Res
> code: 200, QTime: 1357 ms, Request time: 1360 ms.
> >>> 190318-162811.732-189968 DBG1:doc_count: 10 , doc_size: 473  KB, Res
> code: 200, QTime: 90 ms, Request time: 92 ms.
> >>> 190318-162811.995-189981 DBG1:doc_count: 10 , doc_size: 610  KB, Res
> code: 200, QTime: 5306 ms, Request time: 5308 ms.
> >>> 190318-162814.873-190003 DBG1:doc_count: 10 , doc_size: 508  KB, Res
> code: 200, QTime: 4775 ms, Request time: 4777 ms.
> >>> 190318-162814.889-189972 DBG1:doc_count: 10 , doc_size: 563  KB, Res
> code: 200, QTime: 20222 ms, Request time: 20224 ms.
> >>> 190318-162814.975-191817 DBG1:doc_count: 10 , doc_size: 539  KB, Res
> code: 200, QTime: 27732 ms, Request time: 27735 ms.
> >>> 190318-162814.975-189958 DBG1:doc_count: 10 , doc_size: 616  KB, Res
> code: 200, QTime: 28106 ms, Request time: 28109 ms.
> >>> 190318-162814.975-190004 DBG1:doc_count: 10 , doc_size: 473  KB, Res
> code: 200, QTime: 16703 ms, Request time: 16706 ms.
> >>> 190318-162814.982-189963 DBG1:doc_count: 10 , doc_size: 540  KB, Res
> code: 200, QTime: 28216 ms, Request time: 28218 ms.
> >>> 190318-162814.988-190007 DBG1:doc_count: 10 , doc_size: 673  KB, Res
> code: 200, QTime: 28133 ms, Request time: 28136 ms.
> >>> 190318-162814.993-189962 DBG1:doc_count: 10 , doc_size: 631  KB, Res
> code: 200, QTime: 27909 ms, Request time: 27912 ms.
> >>> 190318-162814.996-191818 DBG1:doc_count: 10 , doc_size: 529  KB, Res
> code: 200, QTime: 28172 ms, Request time: 28174 ms.
> >>> 190318-162815.056-189986 DBG1:doc_count: 10 , doc_size: 602  KB, Res
> code: 200, QTime: 11375 ms, Request time: 11378 ms.
> >>> 190318-162815.100-189988 DBG1:doc_count: 10 , doc_size: 530  KB, Res
> code: 200, QTime: 8663 ms, Request time: 8666 ms.
> >>> 190318-162815.275-189980 DBG1:doc_count: 10 , doc_size: 553  KB, Res
> code: 200, QTime: 27526 ms, Request time: 27528 ms.
> >>> 190318-162815.283-189997 DBG1:doc_count: 10 , doc_size: 600  KB, Res
> code: 200, QTime: 27529 ms, Request time: 27535 ms.
> >>> 190318-162815.318-189961 DBG1:doc_count: 10 , doc_size: 661  KB, Res
> code: 200, QTime: 16617 ms, Request time: 16621 ms.
> >>> 190318-162815.484-189952 DBG1:doc_count: 10 , doc_size: 549  KB, Res
> code: 200, QTime: 11707 ms, Request time: 11711 ms.
> >>> 190318-162815.485-189993 DBG1:doc_count: 10 , doc_size: 618  KB, Res
> code: 200, QTime: 28315 ms, Request time: 28321 ms.
> >>> 190318-162816.216-189972 DBG1:doc_count: 10 , doc_size: 493  KB, Res
> code: 200, QTime: 1320 ms, Request time: 1322 ms.
> >>> 190318-162816.354-189972 DBG1:doc_count: 10 , doc_size: 631  KB, Res
> code: 200, QTime: 130 ms, Request time: 132 ms.
> >>> 190318-162816.471-189972 DBG1:doc_count: 10 , doc_size: 563  KB, Res
> code: 200, QTime: 111 ms, Request time: 113 ms.
> >>> 190318-162816.586-189972 DBG1:doc_count: 10 , doc_size: 554  KB, Res
> code: 200, QTime: 110 ms, Request time: 111 ms.
> >>> 190318-162816.716-189972 DBG1:doc_count: 10 , doc_size: 590  KB, Res
> code: 200, QTime: 124 ms, Request time: 125 ms.
> >>> 190318-162820.494-189972 DBG1:doc_count: 10 , doc_size: 583  KB, Res
> code: 200, QTime: 3772 ms, Request time: 3774 ms.
> >>> 190318-162820.574-189953 DBG1:doc_count: 10 , doc_size: 550  KB, Res
> code: 200, QTime: 32802 ms, Request time: 32804 ms.
> >>> 190318-162820.609-189991 DBG1:doc_count: 10 , doc_size: 586  KB, Res
> code: 200, QTime: 33787 ms, Request time: 33790 ms.
> >>> 190318-162820.621-189976 DBG1:doc_count: 10 , doc_size: 572  KB, Res
> code: 200, QTime: 33397 ms, Request time: 33400 ms.
> >>> 190318-162820.622-189985 DBG1:doc_count: 10 , doc_size: 498  KB, Res
> code: 200, QTime: 32917 ms, Request time: 32919 ms.
> >>> 190318-162820.987-190005 DBG1:doc_count: 10 , doc_size: 629  KB, Res
> code: 200, QTime: 22207 ms, Request time: 22209 ms.
> >>> 190318-162821.028-189979 DBG1:doc_count: 10 , doc_size: 584  KB, Res
> code: 200, QTime: 22800 ms, Request time: 22802 ms.
> >>> 190318-162821.056-189948 DBG1:doc_count: 10 , doc_size: 670  KB, Res
> code: 200, QTime: 34193 ms, Request time: 34195 ms.
> >>> 190318-162821.062-189983 DBG1:doc_count: 10 , doc_size: 675  KB, Res
> code: 200, QTime: 22250 ms, Request time: 22252 ms.
> >>>
> >>>
> >>>
> >>>
> >>> BRs
> >>>
> >>> //Aaron
> >>>
> >>> ________________________________
> >>> From: Chris Ulicny <culicny@iq.media>
> >>> Sent: Monday, March 18, 2019 2:54:25 PM
> >>> To: solr-user@lucene.apache.org
> >>> Subject: Re: Solr index slow response
> >>>
> >>> One other thing to look at besides the heap is your commit settings.
> We've
> >>> experienced something similar, and changing commit settings alleviated
> the
> >>> issue.
> >>>
> >>> Are you opening a search on every hardcommit? If so, you might want to
> >>> reconsider and use the softcommit for the hourly creation of a new
> searcher.
> >>>
> >>> The hardcommit interval should be much lower. Probably something on the
> >>> order of seconds (15000) instead of hours (currently 3600000). When the
> >>> hard commit fires, numerous merges might be firing off in the
> background
> >>> due to the volume of documents you are indexing, which might explain
> the
> >>> periodic bad response times shown in your logs.
> >>>
> >>> It would depend on your specific scenario, but here's our setup. During
> >>> long periods of constant indexing of documents to a staging collection
> (~2
> >>> billion documents), we have following commit settings
> >>>
> >>> softcommit: 3600000ms (for periodic validation of data, since it's not
> in
> >>> production)
> >>> hardcommit: openSearcher -> false, 15000ms (no document limit)
> >>>
> >>> This makes the documents available for searching every hour, but
> doesn't
> >>> result in the large bursts of IO due to the infrequent hard commits.
> >>>
> >>> For more info, Erick Erickson has a great write up:
> >>>
> https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >>>
> >>> Best,
> >>> Chris
> >>>
> >>> On Mon, Mar 18, 2019 at 9:36 AM Aaron Yingcai Sun <YSun@vizrt.com>
> wrote:
> >>>
> >>>> Hi, Emir,
> >>>>
> >>>> My system used to run with max 32GB, the response time is bad as well.
> >>>> swap is set to 4GB, there 3.2 free, I doubt swap would affect it since
> >>>> there is such huge free memory.
> >>>>
> >>>> I could try to with set Xms and Xmx to the same value, but I doubt how
> >>>> much would that change the response time.
> >>>>
> >>>>
> >>>> BRs
> >>>>
> >>>> //Aaron
> >>>>
> >>>> ________________________________
> >>>> From: Emir Arnautović <emir.arnautovic@sematext.com>
> >>>> Sent: Monday, March 18, 2019 2:19:19 PM
> >>>> To: solr-user@lucene.apache.org
> >>>> Subject: Re: Solr index slow response
> >>>>
> >>>> Hi Aaron,
> >>>> Without looking too much into numbers, my bet would be that it is
> large
> >>>> heap that is causing issues. I would decrease is significantly
> (<30GB) and
> >>>> see if it is enough for your max load. Also, disable swap or reduce
> >>>> swappiness to min.
> >>>>
> >>>> In any case, you should install some monitoring tool that would help
> you
> >>>> do better analysis when you run into problems. One such tool is our
> >>>> monitoring solution: https://sematext.com/spm
> >>>>
> >>>> HTH,
> >>>> Emir
> >>>> --
> >>>> Monitoring - Log Management - Alerting - Anomaly Detection
> >>>> Solr & Elasticsearch Consulting Support Training -
> http://sematext.com/
> >>>>
> >>>>
> >>>>
> >>>>> On 18 Mar 2019, at 13:14, Aaron Yingcai Sun <YSun@vizrt.com>
wrote:
> >>>>>
> >>>>> Hello, Emir,
> >>>>>
> >>>>> Thanks for the reply, this is the solr version and heap info,
> standalone
> >>>> single solr server. I don't have monitor tool connected. only look at
> >>>> 'top', has not seen cpu spike so far, when the slow response happens,
> cpu
> >>>> usage is not high at all, around 30%.
> >>>>>
> >>>>>
> >>>>> # curl 'http://.../solr/admin/info/system?wt=json&indent=true'
> >>>>> {
> >>>>> "responseHeader":{
> >>>>> "status":0,
> >>>>> "QTime":27},
> >>>>> "mode":"std",
> >>>>> "solr_home":"/ardome/solr",
> >>>>> "lucene":{
> >>>>> "solr-spec-version":"6.5.1",
> >>>>> "solr-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75
-
> >>>> jimczi - 2017-04-21 12:23:42",
> >>>>> "lucene-spec-version":"6.5.1",
> >>>>> "lucene-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75
> >>>> - jimczi - 2017-04-21 12:17:15"},
> >>>>> "jvm":{
> >>>>> "version":"1.8.0_144 25.144-b01",
> >>>>> "name":"Oracle Corporation Java HotSpot(TM) 64-Bit Server VM",
> >>>>> "spec":{
> >>>>>   "vendor":"Oracle Corporation",
> >>>>>   "name":"Java Platform API Specification",
> >>>>>   "version":"1.8"},
> >>>>> "jre":{
> >>>>>   "vendor":"Oracle Corporation",
> >>>>>   "version":"1.8.0_144"},
> >>>>> "vm":{
> >>>>>   "vendor":"Oracle Corporation",
> >>>>>   "name":"Java HotSpot(TM) 64-Bit Server VM",
> >>>>>   "version":"25.144-b01"},
> >>>>> "processors":32,
> >>>>> "memory":{
> >>>>>   "free":"69.1 GB",
> >>>>>   "total":"180.2 GB",
> >>>>>   "max":"266.7 GB",
> >>>>>   "used":"111 GB (%41.6)",
> >>>>>   "raw":{
> >>>>>     "free":74238728336,
> >>>>>     "total":193470136320,
> >>>>>     "max":286331502592,
> >>>>>     "used":119231407984,
> >>>>>     "used%":41.64103736566334}},
> >>>>> "jmx":{
> >>>>>
> >>>>
> "bootclasspath":"/usr/java/jdk1.8.0_144/jre/lib/resources.jar:/usr/java/jdk1.8.0_144/jre/lib/rt.jar:/usr/java/jdk1.8.0_144/jre/lib/sunrsasign.jar:/usr/java/jdk1.8.0_144/jre/lib/jsse.jar:/usr/java/jdk1.8.0_144/jre/lib/jce.jar:/usr/java/jdk1.8.0_144/jre/lib/charsets.jar:/usr/java/jdk1.8.0_144/jre/lib/jfr.jar:/usr/java/jdk1.8.0_144/jre/classes",
> >>>>>   "classpath":"...",
> >>>>>   "commandLineArgs":["-Xms100G",
> >>>>>     "-Xmx300G",
> >>>>>     "-DSTOP.PORT=8079",
> >>>>>     "-DSTOP.KEY=..",
> >>>>>     "-Dsolr.solr.home=..",
> >>>>>     "-Djetty.port=8983"],
> >>>>>   "startTime":"2019-03-18T09:35:27.892Z",
> >>>>>   "upTimeMS":9258422}},
> >>>>> "system":{
> >>>>> "name":"Linux",
> >>>>> "arch":"amd64",
> >>>>> "availableProcessors":32,
> >>>>> "systemLoadAverage":14.72,
> >>>>> "version":"3.0.101-311.g08a8a9d-default",
> >>>>> "committedVirtualMemorySize":2547960700928,
> >>>>> "freePhysicalMemorySize":4530696192,
> >>>>> "freeSwapSpaceSize":3486846976,
> >>>>> "processCpuLoad":0.3257436126790475,
> >>>>> "processCpuTime":93869450000000,
> >>>>> "systemCpuLoad":0.3279781055816521,
> >>>>> "totalPhysicalMemorySize":406480175104,
> >>>>> "totalSwapSpaceSize":4302303232 <(430)%20230-3232>
> <(430)%20230-3232>,
> >>>>> "maxFileDescriptorCount":32768,
> >>>>> "openFileDescriptorCount":385,
> >>>>> "uname":"Linux ... 3.0.101-311.g08a8a9d-default #1 SMP Wed Dec 14
> >>>> 10:15:37 UTC 2016 (08a8a9d) x86_64 x86_64 x86_64 GNU/Linux\n",
> >>>>> "uptime":" 13:09pm  up 5 days 21:23,  7 users,  load average: 14.72,
> >>>> 12.28, 11.48\n"}}
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> ________________________________
> >>>>> From: Emir Arnautović <emir.arnautovic@sematext.com>
> >>>>> Sent: Monday, March 18, 2019 12:10:30 PM
> >>>>> To: solr-user@lucene.apache.org
> >>>>> Subject: Re: Solr index slow response
> >>>>>
> >>>>> Hi Aaron,
> >>>>> Which version of Solr? How did you configure your heap? Is it
> standalone
> >>>> Solr or SolrCloud? A single server? Do you use some monitoring tool?
> Do you
> >>>> see some spikes, pauses or CPU usage is constant?
> >>>>>
> >>>>> Thanks,
> >>>>> Emir
> >>>>> --
> >>>>> Monitoring - Log Management - Alerting - Anomaly Detection
> >>>>> Solr & Elasticsearch Consulting Support Training -
> http://sematext.com/
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On 18 Mar 2019, at 11:47, Aaron Yingcai Sun <YSun@vizrt.com>
wrote:
> >>>>>>
> >>>>>> Hello, Solr!
> >>>>>>
> >>>>>>
> >>>>>> We are having some performance issue when try to send documents
for
> >>>> solr to index. The repose time is very slow and unpredictable some
> time.
> >>>>>>
> >>>>>>
> >>>>>> Solr server is running on a quit powerful server, 32 cpus, 400GB
> RAM,
> >>>> while 300 GB is reserved for solr, while this happening, cpu usage is
> >>>> around 30%, mem usage is 34%.  io also look ok according to iotop.
> SSD disk.
> >>>>>>
> >>>>>>
> >>>>>> Our application send 100 documents to solr per request, json
> encoded.
> >>>> the size is around 5M each time. some times the response time is
> under 1
> >>>> seconds, some times could be 300 seconds, the slow response happens
> very
> >>>> often.
> >>>>>>
> >>>>>>
> >>>>>> "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited
for
> >>>> 3600000ms; if 1000000 uncommited docs"
> >>>>>>
> >>>>>>
> >>>>>> There are around 100 clients sending those documents at the
same
> time,
> >>>> but each for the client is blocking call which wait the http response
> then
> >>>> send the next one.
> >>>>>>
> >>>>>>
> >>>>>> I tried to make the number of documents smaller in one request,
> such as
> >>>> 20, but  still I see slow response time to time, like 80 seconds.
> >>>>>>
> >>>>>>
> >>>>>> Would you help to give some hint how improve the response time?
> solr
> >>>> does not seems very loaded, there must be a way to make the response
> faster.
> >>>>>>
> >>>>>>
> >>>>>> BRs
> >>>>>>
> >>>>>> //Aaron
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>
> >
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message