Date: Thu, 20 May 2010 08:34:33 -0500
Subject: Re: Highlighter and ConstantScoreQuery don't play together
From: TJ Kolev
To: lucene-net-user@lucene.apache.org

Solution. My code below is fine. To have highlighting, I had to call SetRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE) on every MultiTermQuery I have.

Thanks for the tip.

tjk :)

On Tue, May 18, 2010 at 4:53 PM, TJ Kolev wrote:

> I do. I think. Time for more code, I guess.
>
> See the comment in HighlightTerm().
>
> tjk :)
>
> Query rewrittenQuery = query.Rewrite(searcher.Reader); // searcher is an IndexSearcher opened earlier
> HighlightsFormatter highlighter = new HighlightsFormatter(maxSnippets, fragmentSize, contentLength, rewrittenQuery);
> // Gets the text content that was indexed from my storage, at most contentLength long.
> // I don't store the content in the index.
> string content = GetContent(pathToContentFile, contentLength);
> result.Snippets = highlighter.DoStandardHighlights(content);
>
> class HighlightsFormatter : Formatter
> {
>     private readonly Highlighter _highlighter;
>     private readonly int _maxHighlightLengthInBytes;
>     private readonly int _snippetsMaxNumber;
>     private readonly string _snippetsFormat = Configuration.GetStringAppSetting(Constants.ApplicationSettings.SnippetsHighlightFormat, "{0}");
>     private static readonly SimpleHTMLEncoder _encoder = new SimpleHTMLEncoder();
>
>     public HighlightsFormatter(int maxSnippets, int fragmentSize, int maxContentLengthForHL, Query query)
>     {
>         Debug.Assert(null != query);
>         _highlighter = new Highlighter(this, _encoder, new QueryScorer(query, Constants.SearchFields.FIELD_CONTENT));
>         _maxHighlightLengthInBytes = Math.Min(maxContentLengthForHL, Settings.MaxContentLengthForHighlightsInBytes());
>         _highlighter.SetMaxDocBytesToAnalyze(_maxHighlightLengthInBytes); // bytes vs chars?
>
>         _highlighter.SetTextFragmenter(new SimpleFragmenter(fragmentSize));
>         _snippetsMaxNumber = maxSnippets;
>     }
>
>     // Formatter contract
>     public virtual String HighlightTerm(String originalText, TokenGroup group)
>     {
>         /*
>         This is where it fails. I never get anything but a totScore of 0, so I am not applying
>         the highlighting. I tracked it down to that call in ConstantScoreQuery. Before it
>         worked fine - the correct strings had totScore > 0 and they were highlighted.
>         */
>         float totScore = group.GetTotalScore();
>         if (totScore <= 0f)
>             return originalText;
>         return String.Format(_snippetsFormat, originalText);
>     }
>
>     public String[] DoStandardHighlights(String content)
>     {
>         if (string.IsNullOrEmpty(content))
>             return null;
>
>         try
>         {
>             return _highlighter.GetBestFragments(IndexAnalyzer, Constants.SearchFields.FIELD_CONTENT, content, _snippetsMaxNumber);
>         }
>         catch (Exception ex)
>         {
>             _log.Warn("Snippet highlighting failed with exception", ex);
>             return null;
>         }
>     }
> }
>
>
> On Tue, May 18, 2010 at 3:57 PM, Digy wrote:
>
>> Then, you should think of using Query.Rewrite for your queries.
>>
>> DIGY.
>>
>> -----Original Message-----
>> From: TJ Kolev [mailto:tjkolev@gmail.com]
>> Sent: Tuesday, May 18, 2010 11:41 PM
>> To: lucene-net-user@lucene.apache.org
>> Subject: Re: Highlighter and ConstantScoreQuery don't play together
>>
>> I am not using a parser. I build my query as a BooleanQuery using various
>> Query and Filter objects.
>>
>> tjk :)
>>
>> On Tue, May 18, 2010 at 2:48 PM, Robert Jordan wrote:
>>
>> > On 18.05.2010 17:09, TJ Kolev wrote:
>> >
>> >> Greetings!
>> >>
>> >> I am in the process of upgrading from 2.3.1 to 2.9.2 and my highlighter
>> >> stopped working. I tracked the issue down to this code in
>> >> Lucene.Net.Search.ConstantScoreQuery:
>> >>
>> >>     public override void ExtractTerms(System.Collections.Hashtable terms)
>> >>     {
>> >>         // OK to not add any terms when used for MultiSearcher,
>> >>         // but may not be OK for highlighting
>> >>     }
>> >>
>> >> The highlighter goes through it and the terms are not populated.
>> >>
>> >> What is to be done to make it work?
>> >>
>> >> I found https://issues.apache.org/jira/browse/LUCENE-1731 and it implies
>> >> it is fixed for 2.9. How valid is this for the .net release?
>> >>
>> >> I can supply more code if needed.
>> >>
>> >
>> > A workaround for this is calling
>> >
>> > SetMultiTermRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
>> >
>> > on the query parser that creates the query you're using for the
>> > highlighter.
>> >
>> > IIRC, even Lucene/Java needs this call.
>> >
>> > Robert
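
For a query that is built by hand from a BooleanQuery rather than by a query parser, the fix described in this thread might be applied with a small recursive helper like the one below. This is only a sketch, not the code actually used: the names HighlightQueryHelper and UseScoringRewrite are made up for illustration, and it assumes the Lucene.Net 2.9.x method names BooleanQuery.GetClauses() and BooleanClause.GetQuery(). It also assumes, as in Lucene/Java 2.9, that FuzzyQuery refuses to change its rewrite method, so it is skipped.

```csharp
using Lucene.Net.Search;

// Hypothetical helper, not part of Lucene.Net.
static class HighlightQueryHelper
{
    // Walks a query tree and switches every MultiTermQuery (prefix, wildcard,
    // range, ...) to the scoring boolean rewrite, so that Query.Rewrite()
    // expands it into term clauses the highlighter can extract terms from.
    public static void UseScoringRewrite(Query query)
    {
        MultiTermQuery multiTerm = query as MultiTermQuery;
        if (multiTerm != null)
        {
            // Assumption: FuzzyQuery throws if its rewrite method is changed
            // (it does in Lucene/Java 2.9), so leave it alone.
            if (!(multiTerm is FuzzyQuery))
                multiTerm.SetRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
            return;
        }

        BooleanQuery booleanQuery = query as BooleanQuery;
        if (booleanQuery != null)
        {
            // Recurse into nested clauses; other query types are left untouched.
            foreach (BooleanClause clause in booleanQuery.GetClauses())
                UseScoringRewrite(clause.GetQuery());
        }
    }
}
```

The helper would be called on the original query before query.Rewrite(...), and the rewritten query then passed to QueryScorer as in the code quoted above. One thing to watch for: the scoring boolean rewrite expands to one clause per matching term, so a very broad prefix or wildcard can hit the BooleanQuery clause limit.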