lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Heikki Doeleman <hdoele...@amadeus.com>
Subject Re: Highlighting span for Phrase Queries
Date Tue, 14 Nov 2006 09:08:59 GMT
Hi Mark,

thanks for pointing out these, however neither seems to do exactly what I 
want, i.e. highlight a phrase when a phrase search was done.

All of these highlighting solutions seem concerned with selecting "the 
best bits" of a document, along with highlighting some parts thereof. To 
me this seems like a mix-up of different functionalities; I would expect a 
highlighter to do just the highlighting, and nothing else (not that this 
has anything to do with my phrase highlighting problem -- just wondering).



Heikki DOELEMAN





mark harwood <markharw00d@yahoo.co.uk> 
To
java-user@lucene.apache.org
cc

bcc

Subject
Re: Highlighting span for Phrase Queries





mark harwood <markharw00d@yahoo.co.uk> 
Please respond to java-user@lucene.apache.org
10/11/2006 17:46


There have been a couple of alternative Highlighter contributions 
recently, I can't recall which claim to support "proper" highlighting of 
phrases but you might want to give them a try.


http://issues.apache.org/jira/browse/LUCENE-644

http://issues.apache.org/jira/browse/LUCENE-663


Ultimately "proper" highlighting is very hard to achieve if you both want 
to select a summary of the doc's "best bits" and also support queries 
which can be arbitrarily complex nestings of boolean queries which can 
also contain Spans/phrases covering large sections of the document. Some 
compromises are inevitable under these extreme circumstances and I don't 
think there is one implementation that is capable of catering for this.

Let us know how you get on.

Cheers
Mark


----- Original Message ----
From: Heikki Doeleman <hdoeleman@amadeus.com>
To: java-user@lucene.apache.org
Sent: Friday, 10 November, 2006 2:45:23 PM
Subject: Highlighting span for Phrase Queries

Hi there,

I have a question on using the Highlighter.

I'm using Lucene in a web application that allows you to search the 
catalogue of a library. The idea is to highlight, in the results, the 
terms entered by the user. I'm using a Highlighter with a NullFragmenter 
because I want the whole field highlighted (for each field the user 
searched in).

The catalogue is indexed with a StandardAnalyzer on each field. 
Highlighting a field from a result I do like this  :

            QueryScorer scorer = new QueryScorer ( theQuery, 
searcher.getIndexReader ( ) , fieldName ) ;
            Highlighter highlighter = new Highlighter ( new 
SimpleHTMLFormatter ( "<span style=\"border-style:solid;\">" , "</span>" ) 

, scorer ) ;
            highlighter. setTextFragmenter ( new NullFragmenter ( ) ) ;

        highlightedFieldContent = highlighter. getBestFragment ( analyzer 
, fieldName , originalFieldContent ) ; 

This works all very well, except for phrase queries. The spans of the 
phrase queries as such are not highlighted and instead, each of the terms 
that is in the phrase query, gets highlighted. I guess this is because the 

indexed fields have been tokenized, and what-not, by the StandardAnalyzer.

Does anyone have a good example about how to implement a highlighting 
function that works well with phrase queries, too ?

thank you very much.

Heikki DOELEMAN




 
___________________________________________________________ 
Now you can scan emails quickly with a reading pane. Get the new Yahoo! 
Mail. http://uk.docs.yahoo.com/nowyoucan.html

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org



Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message