lucene-java-user mailing list archives

From Amin Mohammed-Coleman <>
Subject Re: Lucene Highlighting and Dynamic Summaries
Date Mon, 09 Mar 2009 07:50:41 GMT
I am seeing some strange behaviour with the highlighter and I'm wondering if
anyone else is experiencing this.  In certain instances no summary is
generated.  I perform the search and it returns the correct document, and I
can see that the Lucene document contains the text in the field.  However,
after doing:

SimpleHTMLFormatter simpleHTMLFormatter = new SimpleHTMLFormatter(
        "<span class=\"highlight\"><b>", "</b></span>");

//required for highlighting

Query query2 = multiSearcher.rewrite(query);

Highlighter highlighter = new Highlighter(simpleHTMLFormatter,


String text = doc.get(FieldNameEnum.BODY.getDescription());

TokenStream tokenStream = analyzer.tokenStream(
        FieldNameEnum.BODY.getDescription(), new StringReader(text));

String result = highlighter.getBestFragments(tokenStream, text, 3, "...");

the string result is empty.  This is very strange: if I try a different term
that exists in the document then I get a summary.  For example, I have a Word
document that contains the terms "document" and "aspectj".  If I search for
"document" I get the correct document but no highlighted summary.  However,
if I search using "aspectj" I get the same document with highlighted

Just to mention, I do rewrite the original query before performing the

I'm not sure what I'm missing here.  Any help would be appreciated.



On Sat, Mar 7, 2009 at 4:32 PM, Amin Mohammed-Coleman <> wrote:

> Hi
> Got it working!  Thanks again for your help!
> Amin
> On Sat, Mar 7, 2009 at 12:25 PM, Amin Mohammed-Coleman <> wrote:
>> Thanks!  The final piece that I needed to do for the project!
>> Cheers
>> Amin
>> On Sat, Mar 7, 2009 at 12:21 PM, Uwe Schindler <> wrote:
>>> > Cool.  I will use compression and store in the index.  Is there
>>> > anything special I need to do for decompressing the text?  I presume
>>> > I can just do doc.get("content")?
>>> > Thanks for your advice, all!
>>> No, just use Field.Store.COMPRESS when adding to the index and
>>> Document.get() when fetching. The decompression is done automatically.
>>> You may wonder why not enable compression for all fields. The reason is
>>> that compression adds overhead for very small, short fields, so you
>>> should only use it for large contents. It is the same as compressing
>>> very small files with ZIP/GZIP: they mostly end up larger than without
>>> compression.
>>> Uwe
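To illustrate Uwe's point about small fields, here is a minimal sketch that compares raw and compressed sizes for a short value and a long, repetitive one. It uses java.util.zip.Deflater directly and does not touch Lucene at all; the class name, method name and sample strings are made up for illustration, and the compressor settings are an assumption that may not match what Field.Store.COMPRESS does internally.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

public class CompressionOverheadSketch {

    // Returns the number of bytes produced by DEFLATE-compressing the input.
    // Hypothetical helper for illustration only; not part of Lucene.
    static int compressedSize(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[1024];
        int total = 0;
        // Drain the compressor in chunks; only the byte count is kept.
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        // A short value, such as an id field (sample data, assumed).
        byte[] small = "id-42".getBytes(StandardCharsets.UTF_8);

        // A long, repetitive body text (sample data, assumed).
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            sb.append("the quick brown fox jumps over the lazy dog ");
        }
        byte[] large = sb.toString().getBytes(StandardCharsets.UTF_8);

        System.out.println("small: " + small.length + " -> " + compressedSize(small));
        System.out.println("large: " + large.length + " -> " + compressedSize(large));
    }
}
```

The short value comes out larger than it went in because of the fixed header and checksum bytes, while the long text shrinks considerably, which is why compression is only worth enabling on large content fields.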
