lucene-solr-user mailing list archives

From Peter Wolanin <peter.wola...@acquia.com>
Subject Re: Highlighting performance between 1.3 and 1.4rc
Date Sat, 07 Nov 2009 04:35:03 GMT
Trying to clarify when the new behavior is useful - if I'm using the
dismax handler, then would it make sense to always default to
usePhraseHighlighter=false?

-Peter
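
For reference, a minimal sketch of what turning the phrase highlighter off looks like on a dismax request. It assumes the switch is passed as the Solr request parameter hl.usePhraseHighlighter (the thread abbreviates it to usePhraseHighlighter). The field names (title, isbn, book_text) and the helper function are hypothetical and not taken from Jake's setup.

```python
from urllib.parse import urlencode, parse_qs


def build_highlight_query(user_query, use_phrase_highlighter):
    """Return a URL-encoded query string for a dismax search with highlighting.

    Field names below are hypothetical placeholders, not from the thread.
    """
    params = {
        "q": user_query,
        "defType": "dismax",
        "qf": "title isbn book_text",  # hypothetical fields searched by dismax
        "hl": "true",                  # enable highlighting
        "hl.fl": "book_text",          # hypothetical field to highlight
        # false = older, faster highlighting; phrase and multi-term
        # queries are then not highlighted correctly (per Mark's reply below)
        "hl.usePhraseHighlighter": "true" if use_phrase_highlighter else "false",
    }
    return urlencode(params)


# Example: an exact-title search with the phrase highlighter switched off.
query_string = build_highlight_query('"Moby Dick"', use_phrase_highlighter=False)
print(query_string)
```

Since the basic search described below puts the exact title in quotes, it is a phrase query, so switching the flag off trades correct phrase highlighting for speed.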

On Wed, Nov 4, 2009 at 1:42 AM, Jake Brownell <jakeb@benetech.org> wrote:
> Thanks Mark, that did bring the time back down. I'll have to investigate a little more
> and weigh the pros and cons of each to determine which best suits our needs.
>
> Jake
>
> -----Original Message-----
> From: Mark Miller [mailto:markrmiller@gmail.com]
> Sent: Tuesday, November 03, 2009 11:23 PM
> To: solr-user@lucene.apache.org
> Cc: solr-user@lucene.apache.org
> Subject: Re: Highlighting performance between 1.3 and 1.4rc
>
> The 1.4 highlighter is now slower if you have multi-term queries or
> phrase queries. You can get the old, faster behavior if you
> pass usePhraseHighlighter=false, but you will not get correct phrase
> highlighting, and multi-term queries (e.g. prefix, wildcard, range)
> won't be highlighted.
>
> - Mark
>
> http://www.lucidimagination.com (mobile)
>
> On Nov 3, 2009, at 8:18 PM, Jake Brownell <jakeb@Benetech.org> wrote:
>
>> Hi,
>>
>> The fix MarkM provided yesterday for the problem I reported
>> encountering with the highlighter appears to be working--I installed
>> the Lucene 2.9.1 rc4 artifacts.
>>
>> Now I'm running into an oddity regarding performance. Our
>> integration test is running slower than it used to. I've placed some
>> average timings below. I'll try to describe what the test does in
>> the hopes that someone will have some insight.
>>
>> The indexing time represents the time it takes to load and
>> index/commit ~43 books. The test then does two sets of searches.
>>
>> A basic search is a dismax search across several fields including
>> the text of the book. It searches either the exact title (in quotes)
>> or the ISBN. Highlighting is enabled on the field that holds the
>> text of the book.
>>
>> An advanced search uses a nested dismax query (inside a normal Lucene query) to
>> search for either the exact title (in quotes) or the ISBN. The main
>> difference is that the title is only matched against fields related
>> to titles, not authors, text of the book, etc. Highlighting is
>> enabled against the text of the book.
>>
>> The indexing time remained fairly constant. I ran with and without
>> highlighting enabled, to see how much it was contributing. I am most
>> interested in the jumps in time between 1.3 and 1.4 for the
>> highlighting time.
>>
>> with highlighting enabled
>> solr 1.3
>> Indexing: 40161ms
>> Basic: 12407ms
>> Advanced: 1106ms
>>
>>
>> solr 1.4 rc
>> Indexing: 41734ms
>> Basic: 26346ms
>> Advanced: 17067ms
>>
>>
>> without any highlighting
>> solr 1.3
>> Indexing: 41186ms
>> Basic: 1024ms
>> Advanced: 265ms
>>
>> solr 1.4 rc
>> Indexing: 40981ms
>> Basic: 883ms
>> Advanced: 356ms
>>
>> FWIW, the integration test uses an embedded solr server.
>>
>> I suppose I should also ask if there are any general tips to speed
>> up highlighting?
>>
>> Thanks,
>> Jake
>



-- 
Peter M. Wolanin, Ph.D.
Momentum Specialist, Acquia, Inc.
peter.wolanin@acquia.com
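
As a rough reading of Jake's numbers quoted above, the cost attributable to highlighting can be estimated by subtracting each "without any highlighting" timing from the matching "with highlighting enabled" timing. The sketch below does only that arithmetic on the figures from the thread. The dictionary layout and function names are illustrative, and the results are averages from a single integration test, not a benchmark.

```python
# Average timings in milliseconds, copied from Jake's message.
timings = {
    "1.3": {
        "with_hl": {"basic": 12407, "advanced": 1106},
        "without_hl": {"basic": 1024, "advanced": 265},
    },
    "1.4rc": {
        "with_hl": {"basic": 26346, "advanced": 17067},
        "without_hl": {"basic": 883, "advanced": 356},
    },
}


def highlight_overhead(version, search):
    """Time (ms) added by enabling highlighting for one search type."""
    run = timings[version]
    return run["with_hl"][search] - run["without_hl"][search]


def slowdown(search):
    """Ratio of the 1.4rc highlighting overhead to the 1.3 overhead."""
    return highlight_overhead("1.4rc", search) / highlight_overhead("1.3", search)


for search in ("basic", "advanced"):
    print(
        search,
        highlight_overhead("1.3", search),
        highlight_overhead("1.4rc", search),
        round(slowdown(search), 1),
    )
```

By this estimate the highlighting overhead grows from 11383 ms to 25463 ms for the basic search and from 841 ms to 16711 ms for the advanced search, which is consistent with both searches using quoted (phrase) titles.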
