lucene-solr-user mailing list archives

From Günter Hipler <guenter.hip...@unibas.ch>
Subject slow solr facet processing
Date Thu, 31 Aug 2017 14:41:36 GMT
Hi,

In the meantime I came across the reason why facet processing has been
so much slower in Solr since version 5.x than in version 4.x:

  https://issues.apache.org/jira/browse/SOLR-8096
  https://issues.apache.org/jira/browse/LUCENE-5666

Various library networks across the world are suffering from the same 
symptoms:

Facet processing is one of the most important features of a search 
server (for us), and it seems (at least IMHO) that there has been no 
solution for the issue since March 2015 (release date of the last Solr 4 
version).

What are the plans / ideas of the solr developers for a possible future 
solution? Or maybe there is already a solution I haven't seen so far.

Thanks for any feedback

Günter



On 21.08.2017 15:35, guenterh.lists@bluewin.ch wrote:
> Hi,
>
> I can't figure out the reason why the facet processing in version 6 
> needs significantly more time compared to version 4.
>
> The debugging response (for 30 million documents)
>
> solr 4
> <lst name="process"><double name="time">280.0</double>
>   <lst name="query"><double name="time">0.0</double></lst>
>   <lst name="facet"><double name="time">280.0</double></lst>
> (once the query is cached)
> before caching: between 1.5 and 2 sec
>
>
> solr 6.x (my last try was with 6.6)
> without docvalues for faceting fields (same schema as version 4)
> <lst name="process"><double name="time">5874.0</double>
>   <lst name="query"><double name="time">0.0</double></lst>
>   <lst name="facet"><double name="time">5873.0</double></lst>
>   <lst name="facet_module"><double name="time">0.0</double></lst>
> the time is not getting better even after repeating the query several 
> times
>
>
> solr 6.6 with docvalues for faceting fields
> <lst name="process"><double name="time">9837.0</double>
>   <lst name="query"><double name="time">0.0</double></lst>
>   <lst name="facet"><double name="time">9837.0</double></lst>
>   <lst name="facet_module"><double name="time">0.0</double></lst>
>
> query used (our production system with version 4)
> http://search.swissbib.ch/solr/sb-biblio/select?debugQuery=true&q=*:*&facet=true&facet.field=union&facet.field=navAuthor_full&facet.field=format&facet.field=language&facet.field=navSub_green&facet.field=navSubform&facet.field=publishDate&qt=edismax&ps=2&json.nl=arrarr&bf=recip(abs(ms(NOW/DAY,freshness)),3.16e-10,100,100)&fl=*,score&hl.fragsize=250&start=0&q.op=AND&sort=score+desc&rows=0&hl.simple.pre={{{{START_HILITE}}}}&facet.limit=100&hl.simple.post={{{{END_HILITE}}}}&spellcheck=false&qf=title_short^1000+title_alt^200+title_sub^200+title_old^200+title_new^200+author^750+author_additional^100+author_additional_dsv11_txt_mv^100+title_additional_dsv11_txt_mv^100+series^200+topic^500+addfields_txt_mv^50+publplace_txt_mv^25+publplace_dsv11_txt_mv^25+fulltext+callnumber^1000+ctrlnum^1000+publishDate+isbn+variant_isbn_isn_mv+issn+localcode+id&pf=title_short^1000&facet.mincount=1&hl.fl=fulltext&&wt=xml&facet.sort=count
>
>
> Running the queries on smaller indices (8 million docs), the difference 
> is similar, although the absolute figures for processing time are smaller
>
>
> Any hints as to why there are such huge differences?
>
> Günter
>
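
The slowdown implied by the debug timings quoted above can be worked out
with a short script. This is a minimal sketch in Python, not part of the
original message: the function names component_times and slowdown are
hypothetical, and a closing </lst> has been appended to each quoted
fragment (an assumption) so that it parses as well-formed XML.

```python
import xml.etree.ElementTree as ET

# Timing fragments as quoted in the message. The final </lst> closing the
# "process" element is NOT in the quoted output; it is added here (an
# assumption) so that each fragment is well-formed XML.
SOLR4_CACHED = (
    '<lst name="process"><double name="time">280.0</double>'
    '<lst name="query"><double name="time">0.0</double></lst>'
    '<lst name="facet"><double name="time">280.0</double></lst></lst>'
)
SOLR6_NO_DOCVALUES = (
    '<lst name="process"><double name="time">5874.0</double>'
    '<lst name="query"><double name="time">0.0</double></lst>'
    '<lst name="facet"><double name="time">5873.0</double></lst>'
    '<lst name="facet_module"><double name="time">0.0</double></lst></lst>'
)
SOLR6_DOCVALUES = (
    '<lst name="process"><double name="time">9837.0</double>'
    '<lst name="query"><double name="time">0.0</double></lst>'
    '<lst name="facet"><double name="time">9837.0</double></lst>'
    '<lst name="facet_module"><double name="time">0.0</double></lst></lst>'
)


def component_times(fragment):
    """Return {component name: time in ms} for one debug timing fragment."""
    root = ET.fromstring(fragment)
    # The total for the phase is the <double name="time"> directly under root.
    times = {root.get("name"): float(root.find("double[@name='time']").text)}
    # Each nested <lst> is one search component with its own time.
    for child in root.findall("lst"):
        times[child.get("name")] = float(child.find("double[@name='time']").text)
    return times


def slowdown(baseline, candidate, component="facet"):
    """Ratio of candidate time to baseline time for one component."""
    return component_times(candidate)[component] / component_times(baseline)[component]


if __name__ == "__main__":
    for label, fragment in [("6.x without docvalues", SOLR6_NO_DOCVALUES),
                            ("6.6 with docvalues", SOLR6_DOCVALUES)]:
        print(f"{label}: {slowdown(SOLR4_CACHED, fragment):.1f}x the Solr 4 facet time")
```

With the figures quoted above, the facet component takes roughly 21 times
as long on 6.x without docvalues and roughly 35 times as long on 6.6 with
docvalues, relative to the cached Solr 4 response.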

-- 
Universität Basel
Universitätsbibliothek
Günter Hipler
Projekt SwissBib
Schoenbeinstrasse 18-20
4056 Basel, Schweiz
Tel.: + 41 (0)61 267 31 12 Fax: ++41 61 267 3103
E-Mail guenter.hipler@unibas.ch
URL: www.swissbib.org  / http://www.ub.unibas.ch/

