From: "Goddard, Michael J."
Reply-To: java-user@lucene.apache.org
Subject: Question on highlighting of nested SpanQuery instances
Date: Wed, 17 Feb 2010 14:11:04 -0500
Message-Id: <3B01AF82880E6947A069AA17FF5CFE1503D2693C@0015-its-exmb04.us.saic.com>

Hello,

I'm seeking some help with a highlighting issue involving the SpanQuery
family. To illustrate my issue, I added a test to the existing
HighlighterTest (see diff, below, against tags/lucene_2_9_1). When this
test runs, it fails and the System.out.println yields this:

Expected: "Sam dislikes most of the food and has to order fish and chips - however the fish is frozen, not fresh.
Observed: "Sam dislikes most of the food and has to order fish and chips - however the fish is frozen, not fresh.

That second "fish" doesn't satisfy the query, so I don't expect it to be
highlighted. Can anyone out there offer a good starting point on this
one?

Regards,
Mike

Index: contrib/highlighter/src/test/org/apache/lucene/search/highlight/HighlighterTest.java
===================================================================
--- contrib/highlighter/src/test/org/apache/lucene/search/highlight/HighlighterTest.java	(revision 908726)
+++ contrib/highlighter/src/test/org/apache/lucene/search/highlight/HighlighterTest.java	(working copy)
@@ -173,7 +173,40 @@
         "Query in a named field does not result in highlighting when that field isn't in the query",
         s1, highlightField(q, FIELD_NAME, s1));
   }
+
+  /*
+   * TODO: Why is that second instance of the term "fish" highlighted?  It is not
+   * followed by the term "chips", so it should not be highlighted.
+   */
+  public void testHighlightingNestedSpans() throws Exception {
 
+    String pubText = "Sam dislikes most of the food and has to order"
+      + " fish and chips - however the fish is frozen, not fresh.";
+
+    String fieldName = "SOME_FIELD_NAME";
+
+    SpanOrQuery spanOr = new SpanOrQuery(
+      new SpanTermQuery[] {
+        new SpanTermQuery(new Term(fieldName, "fish")),
+        new SpanTermQuery(new Term(fieldName, "term1")),
+        new SpanTermQuery(new Term(fieldName, "term2")),
+        new SpanTermQuery(new Term(fieldName, "term3")) });
+
+    SpanNearQuery innerSpanNear = new SpanNearQuery(new SpanQuery[] {
+      spanOr,
+      new SpanTermQuery(new Term(fieldName, "chips")) }, 2, true);
+
+    SpanNearQuery query = new SpanNearQuery(new SpanQuery[] {
+      innerSpanNear,
+      new SpanTermQuery(new Term(fieldName, "frozen")) }, 8, true);
+
+    String expected = "Sam dislikes most of the food and has to order"
+      + " fish and chips - however the fish is frozen, not fresh.";
+    String observed = highlightField(query, fieldName, pubText);
+    System.out.println("Expected: \"" + expected + "\n" + "Observed: \"" + observed);
+    assertEquals("Why is that second instance of the term \"fish\" highlighted?", expected, observed);
+  }
+
   /**
    * This method intended for use with testHighlightingWithDefaultField()
    * @throws InvalidTokenOffsetsException
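
One possible starting point, offered as a hypothesis rather than a confirmed diagnosis: if the highlighter marks every query term whose position falls inside the position range of a matching span, then the second "fish" would be marked simply because it sits between the first "fish" and "frozen". The sketch below does not use Lucene at all. It is a plain-Java approximation of ordered span matching on token positions, and everything in it (the class NestedSpanSketch, its helper methods, the letters-only tokenizer with no stop-word removal, and the pairwise treatment of slop) is an assumption made for illustration. The unmatched "term1" to "term3" clauses of the SpanOrQuery are left out because they never occur in the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-alone sketch; not Lucene code and not the highlighter's implementation.
final class NestedSpanSketch {

    static final String TEXT = "Sam dislikes most of the food and has to order"
        + " fish and chips - however the fish is frozen, not fresh.";

    /** Half-open range of token positions: [start, end). */
    static final class Span {
        final int start;
        final int end;
        Span(int start, int end) { this.start = start; this.end = end; }
        @Override public String toString() { return "[" + start + "," + end + ")"; }
    }

    /** Lowercases and splits on non-letters. Assumption: no stop words are removed. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    /** One single-position span per occurrence of the term. */
    static List<Span> termSpans(List<String> tokens, String term) {
        List<Span> spans = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (tokens.get(i).equals(term)) spans.add(new Span(i, i + 1));
        }
        return spans;
    }

    /** Simplified ordered "near": right starts at or after left's end, with gap <= slop. */
    static List<Span> orderedNear(List<Span> left, List<Span> right, int slop) {
        List<Span> out = new ArrayList<>();
        for (Span l : left) {
            for (Span r : right) {
                int gap = r.start - l.end;
                if (gap >= 0 && gap <= slop) out.add(new Span(l.start, r.end));
            }
        }
        return out;
    }

    /** Positions of the term that lie inside any of the given spans. */
    static List<Integer> positionsInside(List<String> tokens, String term, List<Span> spans) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(term)) continue;
            for (Span s : spans) {
                if (i >= s.start && i < s.end) { hits.add(i); break; }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize(TEXT);
        // Inner query: "fish" followed by "chips", slop 2, in order.
        List<Span> inner = orderedNear(termSpans(tokens, "fish"), termSpans(tokens, "chips"), 2);
        // Outer query: the inner match followed by "frozen", slop 8, in order.
        List<Span> outer = orderedNear(inner, termSpans(tokens, "frozen"), 8);
        System.out.println("inner spans: " + inner);
        System.out.println("outer spans: " + outer);
        System.out.println("\"fish\" positions inside outer span: "
            + positionsInside(tokens, "fish", outer));
    }
}
```

Under these assumptions the inner match is [10,13) and the outer match is [10,18), and "fish" occurs at positions 10 and 15, both inside the outer range. Only the occurrence at 10 takes part in the inner match, so a highlighter that filters terms by the outer span's position range alone would still mark the one at 15. Whether the 2.9.1 highlighter actually works this way would need to be checked against its source.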