lucene-dev mailing list archives

From "Karsten Konrad" <>
Subject AW: Highlight Search Result
Date Fri, 04 Apr 2003 09:12:53 GMT


Another simple solution to highlighting is to (1) collect the terms
that are used when searching in the idf(term, searcher)
method of Similarity and then (2) use this collection for
highlighting.

You must define your own sub-class of Similarity to do this
and put a new instance of Similarity into the IndexSearcher 
whenever you start a new search. Also, the terms stored in the 
collection are the stemmed versions, so your highlighting code 
must use the same stemmer to recognize the terms. However, no 
changes to Lucene's code are necessary to do all this.
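
Step (2) could look roughly like the sketch below. It is only an illustration under assumptions: the set of stemmed terms is taken as already collected by the Similarity subclass, the class name StemAwareHighlighter and the toyStem helper are hypothetical, and toyStem is a stand-in for whatever stemmer the index analyzer really uses. The <b> markup is likewise just a placeholder.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: wraps words whose stem is in the collected term set.
public class StemAwareHighlighter {
    // A "word" here is any run of letters or digits (an assumption, not Lucene's tokenizer).
    private static final Pattern WORD = Pattern.compile("[\\p{L}\\p{N}]+");

    private final Set<String> stemmedTerms;
    private final UnaryOperator<String> stemmer;

    public StemAwareHighlighter(Set<String> stemmedTerms, UnaryOperator<String> stemmer) {
        this.stemmedTerms = new HashSet<>(stemmedTerms);
        this.stemmer = stemmer;
    }

    // Copies the text, wrapping each word whose lowercased stem was a query term.
    public String highlight(String text) {
        Matcher m = WORD.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());   // text between words, unchanged
            String word = m.group();
            String stem = stemmer.apply(word.toLowerCase(Locale.ROOT));
            if (stemmedTerms.contains(stem)) {
                out.append("<b>").append(word).append("</b>");
            } else {
                out.append(word);
            }
            last = m.end();
        }
        out.append(text, last, text.length());   // trailing text after the last word
        return out.toString();
    }

    // Toy stand-in stemmer: strips one trailing "s" from words longer than 3 characters.
    public static String toyStem(String w) {
        return (w.length() > 3 && w.endsWith("s")) ? w.substring(0, w.length() - 1) : w;
    }
}
```

The important point is that the same stemming function is applied to the words of the displayed text as was applied to the indexed terms; otherwise inflected forms such as plurals would not be recognized.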




Dr.-Ing. Karsten Konrad
Head of Information Agent Engineering

XtraMind Technologies GmbH
Stuhlsatzenhausweg 3
D-66123 Saarbrücken
Phone: +49 (681) 3025113
Fax: +49 (681) 3025109

-----Original Message-----
From: Michael Wechner []
Sent: Friday, 4 April 2003 11:00
To: Lucene Developers List
Subject: Re: Highlight Search Result

Lixin Meng wrote:
> When I was looking for a solution that can highlight the query terms in the
> search result, I came across the following one.
> It sounds like a good solution to me. However, to make it work, one needs to
> modify the Lucene source code (such as changing some private declarations to
> public). I guess you already know about it. I just wonder whether there is any
> plan (or any procedure) to incorporate the suggestions into the Lucene
> code base.
> If the answer is no, does anybody know of another solution for highlighting
> that doesn't require code changes?

We integrated Lucene into Apache Lenya (formerly known as Wyona CMS) 
and also offer highlighting: during crawling we dump the "htdocs" onto 
the filesystem, and after the search we read the files corresponding to 
the hits and generate the excerpts with highlighting from them.
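
As a rough illustration of the idea (not the actual Lenya code), the excerpt step might be sketched as follows. The class name FileExcerpt, the method names, the radius parameter and the <b> markup are all hypothetical, and the sketch matches plain query terms case-insensitively without any stemming.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch: build a highlighted excerpt from a file dumped during crawling.
public class FileExcerpt {

    // Returns up to `radius` characters on each side of the first occurrence of
    // any term, with that occurrence wrapped in <b>...</b>. If no term occurs,
    // falls back to the first 2 * radius characters of the text.
    public static String excerpt(String text, List<String> terms, int radius) {
        if (!terms.isEmpty()) {
            // Quote each term so it is matched literally, then try them as alternatives.
            String alternation = terms.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
            Matcher m = Pattern.compile(alternation, Pattern.CASE_INSENSITIVE).matcher(text);
            if (m.find()) {
                int start = Math.max(0, m.start() - radius);
                int end = Math.min(text.length(), m.end() + radius);
                return (start > 0 ? "..." : "")
                    + text.substring(start, m.start())
                    + "<b>" + m.group() + "</b>"
                    + text.substring(m.end(), end)
                    + (end < text.length() ? "..." : "");
            }
        }
        return text.substring(0, Math.min(text.length(), 2 * radius));
    }

    // Reads the dumped file that corresponds to a hit and builds the excerpt from it.
    public static String excerptFromFile(Path dumpedFile, List<String> terms, int radius)
            throws IOException {
        return excerpt(Files.readString(dumpedFile), terms, radius);
    }
}
```

Because the text comes from the filesystem, nothing beyond a path to the dumped file needs to be stored in the index for each document.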

You can see it in action at

You can download the code from:

I think Doug Cutting wrote on the mailing list some time ago that you 
shouldn't put the content as a field into the index, because the index 
will blow up and search performance will be bad.
But I guess that with light usage and only a few documents it 
wouldn't matter that much. Hence we will probably enhance our solution so 
that you can set a flag for where the content is stored, either within 
the index or on the filesystem.



> I am hesitant to maintain a variation of the Lucene mainstream, since I will
> have to patch it every time Lucene has a new release. After all, I just want
> to use it.
> Regards,
> Lixin
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
