lucene-solr-user mailing list archives

From Amrit Sarkar <sarkaramr...@gmail.com>
Subject Re: Highlighting Performance improvement suggestions required - Solr 6.5.1
Date Wed, 09 Aug 2017 14:39:12 GMT
Pardon me, I haven't gone through your configs in detail, and I guess you
have already gone through the recent talks on highlighters, but I'm sharing
them in case you haven't:

https://www.slideshare.net/lucidworks/solr-highlighting-at-full-speed-presented-by-timothy-rodriguez-bloomberg-david-smiley-d-w-smiley-llc
https://www.youtube.com/watch?v=tv5qKDKW8kk

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2

On Wed, Aug 9, 2017 at 7:45 PM, sasarun <sasarun@gmail.com> wrote:

> Hi All,
>
> I found quite a few discussions on the highlighting performance issue.
> Though I tried to implement most of the suggestions, the effect on
> performance was negative.
> Currently the index is really small, with about 922 records, but the field
> on which highlighting is done holds quite large values. Queries with
> highlighting take a long time, with 85-90% of the time spent on highlighting.
> The configuration in my schema.xml is as below:
>
> <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
>     <analyzer type="index">
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>     </analyzer>
>     <analyzer type="query">
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>                 ignoreCase="true" expand="true"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>     </analyzer>
> </fieldType>
> <field name="customContent" type="text_general" indexed="true" stored="true"
>        termVectors="true" termPositions="true" termOffsets="true"
>        storeOffsetsWithPositions="true"/>
> <field name="customContent_term" type="text_general" indexed="false" stored="true"/>
> <copyField source="customContent" dest="customContent_term"/>
>
> The query used in Solr is:
>
> hl=true&hl.fl=customContent&hl.fragsize=500
> &hl.simple.pre=<HL>&hl.simple.post=</HL>&hl.snippets=1
> &hl.method=unified&hl.bs.type=SENTENCE&hl.fragListBuilder=simple
> &hl.maxAnalyzedChars=214748364
> &facet=true&facet.mincount=1&facet.limit=-1&facet.sort=count
> &debug=timing&facet.field=contentSpecific
>
> Also note that we tried the FastVectorHighlighter too, but the result was
> not positive. Once, when we tried hl.offsetSource="term_vectors" with the
> unified highlighter, the result came back in half a second but it did not
> have any highlight snippets.
>
> One of the debug timing outputs returned by Solr is shared below for reference:
>
> time=8833.0,
> prepare={time=0.0,query={time=0.0},facet={time=0.0},facet_module={time=0.0},
>   mlt={time=0.0},highlight={time=0.0},stats={time=0.0},expand={time=0.0},
>   terms={time=0.0},debug={time=0.0}},
> process={time=8826.0,query={time=867.0},facet={time=2.0},facet_module={time=0.0},
>   mlt={time=0.0},highlight={time=7953.0},stats={time=0.0},expand={time=0.0},
>   terms={time=0.0},debug={time=0.0}},
> loadFieldValues={time=28.0}}
>
> Any suggestions to improve the performance would be of great help.
>
> Thanks,
> Arun
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Highlighting-Performance-improvement-suggestions-required-Solr-6-5-1-tp4349767.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

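For anyone comparing highlighter settings against the debug=timing output
quoted above, it may help to compute the highlight share the same way for
each run. The following is only a rough sketch: the timing string is copied
from the quoted message with its line wraps rejoined, and the helper names
(process_times, highlight_share) are made up for illustration and are not part
of Solr.

```python
import re

# Timing line copied from the quoted message, with the email line wraps rejoined.
DEBUG_TIMING = (
    "time=8833.0,prepare={time=0.0,query={time=0.0},facet={time=0.0},"
    "facet_module={time=0.0},mlt={time=0.0},highlight={time=0.0},"
    "stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=0.0}},"
    "process={time=8826.0,query={time=867.0},facet={time=2.0},"
    "facet_module={time=0.0},mlt={time=0.0},highlight={time=7953.0},"
    "stats={time=0.0},expand={time=0.0},terms={time=0.0},debug={time=0.0}},"
    "loadFieldValues={time=28.0}}"
)


def process_block(timing):
    """Return the text between the braces of the process={...} section."""
    start = timing.index("process={") + len("process={")
    depth, end = 1, start
    # Walk forward until the brace opened by "process={" is closed again.
    while depth:
        ch = timing[end]
        depth += (ch == "{") - (ch == "}")
        end += 1
    return timing[start:end - 1]


def process_times(timing):
    """Map each component in the process section to its time in milliseconds."""
    body = process_block(timing)
    return {name: float(ms)
            for name, ms in re.findall(r"(\w+)=\{time=([\d.]+)\}", body)}


def highlight_share(timing):
    """Fraction of the process phase that was spent in the highlight component."""
    body = process_block(timing)
    total = float(re.match(r"time=([\d.]+)", body).group(1))
    return process_times(timing)["highlight"] / total


if __name__ == "__main__":
    print(f"highlight share of process time: {highlight_share(DEBUG_TIMING):.1%}")
```

For the figures in the quoted message this works out to roughly 90% of the
process phase (7953 ms of 8826 ms), which matches the 85-90% estimate given
there.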