lucenenet-user mailing list archives

From TJ Kolev <tjko...@gmail.com>
Subject Re: Highlighter and ConstantScoreQuery don't play together
Date Thu, 20 May 2010 13:34:33 GMT
Solution.

My code below is fine. To get highlighting I had to call
SetRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE) on every
MultiTermQuery I build.
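
For readers who also build their queries by hand, here is a minimal sketch of
how that call could be applied across a whole BooleanQuery tree. The helper
class and method names are hypothetical and not part of the original code. It
assumes the Lucene.Net 2.9.x API (BooleanQuery.GetClauses(),
BooleanClause.GetQuery(), MultiTermQuery.SetRewriteMethod()). FuzzyQuery is
skipped on the assumption that it does not allow its rewrite method to be
changed in 2.9.

```csharp
using Lucene.Net.Search;

// Hypothetical helper, not from the original post.
static class HighlightRewriteHelper
{
    // Walks the query tree and switches every MultiTermQuery (prefix,
    // wildcard, range) to the scoring boolean rewrite. After Query.Rewrite()
    // these become BooleanQuery/TermQuery clauses, so ExtractTerms() has
    // terms to hand to the highlighter instead of an empty ConstantScoreQuery.
    public static void UseScoringRewrite(Query query)
    {
        BooleanQuery booleanQuery = query as BooleanQuery;
        if (booleanQuery != null)
        {
            foreach (BooleanClause clause in booleanQuery.GetClauses())
                UseScoringRewrite(clause.GetQuery());
            return;
        }

        MultiTermQuery multiTermQuery = query as MultiTermQuery;
        // Assumption: FuzzyQuery rejects SetRewriteMethod() in 2.9, so leave it alone.
        if (multiTermQuery != null && !(multiTermQuery is FuzzyQuery))
            multiTermQuery.SetRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
    }
}
```

The helper would need to run before the query.Rewrite(...) call that feeds the
highlighter. One trade-off to keep in mind: the scoring boolean rewrite expands
each matching term into its own clause, so a very broad wildcard or prefix can
hit the BooleanQuery clause limit.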

Thanks for the tip.

tjk :)

On Tue, May 18, 2010 at 4:53 PM, TJ Kolev <tjkolev@gmail.com> wrote:

> I do. I think. Time for more code, I guess.
>
> See the comment in HighlightTerm().
>
> tjk :)
>
> // searcher is an IndexSearcher opened earlier
> Query rewrittenQuery = query.Rewrite(searcher.Reader);
> HighlightsFormatter highlighter = new HighlightsFormatter(maxSnippets, fragmentSize, contentLength, rewrittenQuery);
> // Gets the text content that was indexed from my storage, at most contentLength.
> // I don't store the content in the index.
> string content = GetContent(pathToContentFile, contentLength);
> result.Snippets = highlighter.DoStandardHighlights(content);
>
> class HighlightsFormatter : Formatter
> {
>     private readonly Highlighter _highlighter;
>     private readonly int _maxHighlightLengthInBytes;
>     private readonly int _snippetsMaxNumber;
>     private readonly string _snippetsFormat = Configuration.GetStringAppSetting(
>         Constants.ApplicationSettings.SnippetsHighlightFormat,
>         "<span style=\"background-color: #ffff00;\">{0}</span>");
>     private static readonly SimpleHTMLEncoder _encoder = new SimpleHTMLEncoder();
>
>     public HighlightsFormatter(int maxSnippets, int fragmentSize, int maxContentLengthForHL, Query query)
>     {
>         Debug.Assert(null != query);
>         _highlighter = new Highlighter(this, _encoder,
>             new QueryScorer(query, Constants.SearchFields.FIELD_CONTENT));
>         _maxHighlightLengthInBytes = Math.Min(maxContentLengthForHL,
>             Settings.MaxContentLengthForHighlightsInBytes());
>         _highlighter.SetMaxDocBytesToAnalyze(_maxHighlightLengthInBytes); // bytes vs chars?
>
>         _highlighter.SetTextFragmenter(new SimpleFragmenter(fragmentSize));
>         _snippetsMaxNumber = maxSnippets;
>     }
>
>     // Formatter contract
>     public virtual String HighlightTerm(String originalText, TokenGroup group)
>     {
>         /*
>         This is where it fails. I never get anything but a totScore of 0, so I
>         am not applying the highlighting. I tracked it down to that call in
>         ConstantScoreQuery. Before the upgrade it worked fine: the correct
>         strings had totScore > 0 and they were highlighted.
>         */
>         float totScore = group.GetTotalScore();
>         if (totScore <= 0f)
>             return originalText;
>         return String.Format(_snippetsFormat, originalText);
>     }
>
>     public String[] DoStandardHighlights(String content)
>     {
>         if (string.IsNullOrEmpty(content))
>             return null;
>
>         try
>         {
>             return _highlighter.GetBestFragments(IndexAnalyzer,
>                 Constants.SearchFields.FIELD_CONTENT, content, _snippetsMaxNumber);
>         }
>         catch (Exception ex)
>         {
>             _log.Warn("Snippet highlighting failed with exception", ex);
>             return null;
>         }
>     }
> }
>
>
> On Tue, May 18, 2010 at 3:57 PM, Digy <digydigy@gmail.com> wrote:
>
>> Then, you should think of using Query.Rewrite for your queries.
>>
>> DIGY.
>>
>> -----Original Message-----
>> From: TJ Kolev [mailto:tjkolev@gmail.com]
>> Sent: Tuesday, May 18, 2010 11:41 PM
>> To: lucene-net-user@lucene.apache.org
>> Subject: Re: Highlighter and ConstantScoreQuery don't play together
>>
>> I am not using a parser. I build my query as a BooleanQuery using various
>> Query and Filter objects.
>>
>> tjk :)
>>
>> On Tue, May 18, 2010 at 2:48 PM, Robert Jordan <robertj@gmx.net> wrote:
>>
>> > On 18.05.2010 17:09, TJ Kolev wrote:
>> >
>> >> Greetings!
>> >>
>> >> I am in the process of upgrading from 2.3.1 to 2.9.2 and my highlighter
>> >> stopped working. I tracked the issue down to this code in
>> >> Lucene.Net.Search.ConstantScoreQuery:
>> >>
>> >>         public override void ExtractTerms(System.Collections.Hashtable terms)
>> >>         {
>> >>             // OK to not add any terms when used for MultiSearcher,
>> >>             // but may not be OK for highlighting
>> >>         }
>> >>
>> >> The highlighter goes through it and the terms are not populated.
>> >>
>> >> What is to be done to make it work?
>> >>
>> >> I found https://issues.apache.org/jira/browse/LUCENE-1731 and it implies
>> >> it is fixed for 2.9. How valid is this for the .net release?
>> >>
>> >> I can supply more code if needed.
>> >>
>> >
>> > A workaround for this is calling
>> >
>> > SetMultiTermRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
>> >
>> > on the query parser that creates the query you're using for the
>> > highlighter.
>> >
>> > IIRC, even Lucene/Java needs this call.
>> >
>> > Robert
>> >
>> >
>>
>>
>
