lucene-dev mailing list archives

From "none none" <>
Subject Re: Highlighting Redux
Date Fri, 21 Mar 2003 01:45:09 GMT

On Thu, 20 Mar 2003 19:20:27, Ype Kingma wrote:
>On Thursday 20 March 2003 10:12, Leander Harding wrote:
>> Hi,
>>     Yes, it's another question about Term highlighting. Essentially, what
>> I'm looking to do is obtain a set of term positions in a given document that
>> are hits for a given Query. I've read the archives and looked at the
>> contributed code, but it all fails in one important (to my employer)
>> respect: it doesn't understand the semantics of Lucene queries, rather it
>> looks at the terms they contain and highlights them all. Consider the
>> following query:
>> ("foo" AND "bar") OR "baz"
>> Suppose that we search using this query and the following document is a
>> hit: <doc>Foo.....quux......baz.</doc>

In the example above, only "baz" should be highlighted.
There can also be a document, <doc>foo....</doc>, in which all 3 terms are
highlighted (this case has priority; the score will rank that document first),
and a case, <doc></doc>, in which only "foo" and "bar" are highlighted.
So, besides the "baz"-only case, the combinations you could see are:
foo bar baz
foo bar

Run 3 tests, with 1 document at a time, covering those combinations and see
what happens. If the above is wrong, we should ask one of the developers for
an explanation.
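
To make the expected outcome of those three tests concrete, here is a rough sketch of match-aware highlighting. It is only an illustration under assumptions: the Node interface and the term/and/or/tokenize/termsToHighlight helpers are made up for this message and are not Lucene's API, the tokenizer is a naive lowercase split, and the "foo bar baz" and "foo bar" documents are invented stand-ins for the two cases described above.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical mini query model for illustration only; NOT Lucene's API.
final class MatchAwareHighlight {

    interface Node {
        // Terms that contribute to a match, or empty if this node does not match.
        Optional<Set<String>> matchedTerms(Set<String> docTerms);
    }

    // A single term matches when the document contains it.
    static Node term(String t) {
        return doc -> {
            if (!doc.contains(t)) {
                return Optional.empty();
            }
            Set<String> hit = new TreeSet<>();
            hit.add(t);
            return Optional.of(hit);
        };
    }

    // AND matches only if every child matches; one failure discards all its terms.
    static Node and(Node... children) {
        return doc -> {
            Set<String> all = new TreeSet<>();
            for (Node c : children) {
                Optional<Set<String>> m = c.matchedTerms(doc);
                if (!m.isPresent()) {
                    return Optional.empty();
                }
                all.addAll(m.get());
            }
            return Optional.of(all);
        };
    }

    // OR matches if any child matches; only the matching children contribute terms.
    static Node or(Node... children) {
        return doc -> {
            Set<String> any = new TreeSet<>();
            boolean matched = false;
            for (Node c : children) {
                Optional<Set<String>> m = c.matchedTerms(doc);
                if (m.isPresent()) {
                    matched = true;
                    any.addAll(m.get());
                }
            }
            if (!matched) {
                return Optional.empty();
            }
            return Optional.of(any);
        };
    }

    // Naive tokenizer (an assumption): lowercase, split on non-alphanumerics.
    static Set<String> tokenize(String text) {
        Set<String> terms = new TreeSet<>();
        for (String tok : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!tok.isEmpty()) {
                terms.add(tok);
            }
        }
        return terms;
    }

    static Set<String> termsToHighlight(Node query, String text) {
        Optional<Set<String>> m = query.matchedTerms(tokenize(text));
        return m.isPresent() ? m.get() : new TreeSet<String>();
    }

    public static void main(String[] args) {
        // ("foo" AND "bar") OR "baz"
        Node query = or(and(term("foo"), term("bar")), term("baz"));
        String[] docs = { "Foo.....quux......baz.", "foo bar baz", "foo bar" };
        for (String d : docs) {
            System.out.println(d + " -> " + termsToHighlight(query, d));
        }
        // Foo.....quux......baz. -> [baz]
        // foo bar baz -> [bar, baz, foo]
        // foo bar -> [bar, foo]
    }
}
```

This says nothing about how Lucene scores the documents; it only shows which terms satisfy the boolean structure of the query, which is the behaviour the three tests should confirm or refute.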

Hope this helps,

>> Which Terms do we highlight?
>> All of the existing highlighting code I've seen would highlight both "foo"
>> and "baz", but this isn't correct - the document contains "foo", but no
>> "bar", thus, since "foo" in the query is part of an AND expression that
>> wasn't satisfied by this document, only "baz" should be highlighted.
>According to the scoring documentation, although Foo is not a boolean match, 
>it is still used to determine the score, so one might as well highlight it.
>Kind regards,
