lucene-java-user mailing list archives

From Amin Mohammed-Coleman <ami...@gmail.com>
Subject Re: Lucene Highlighting and Dynamic Summaries
Date Mon, 09 Mar 2009 07:50:41 GMT
Hi,
I am seeing some strange behaviour with the highlighter and I'm wondering if
anyone else is experiencing this.  In certain instances no summary is
generated.  I perform the search and it returns the correct document, and I
can see that the Lucene document contains the text in the field.  However,
after doing:

SimpleHTMLFormatter simpleHTMLFormatter =
    new SimpleHTMLFormatter("<span class=\"highlight\"><b>", "</b></span>");

// required for highlighting
Query query2 = multiSearcher.rewrite(query);

Highlighter highlighter =
    new Highlighter(simpleHTMLFormatter, new QueryScorer(query2));

...

String text = doc.get(FieldNameEnum.BODY.getDescription());

TokenStream tokenStream = analyzer.tokenStream(
    FieldNameEnum.BODY.getDescription(), new StringReader(text));

String result = highlighter.getBestFragments(tokenStream, text, 3, "...");




The string result is empty.  This is very strange: if I try a different term
that exists in the document then I get a summary.  For example, I have a word
document that contains the terms "document" and "aspectj".  If I search for
"document" I get the correct document but no highlighted summary.  However,
if I search for "aspectj" I get the same document with a highlighted
summary.


Just to mention, I do rewrite the original query before performing the
highlighting.


I'm not sure what I'm missing here.  Any help would be appreciated.


Cheers

Amin

On Sat, Mar 7, 2009 at 4:32 PM, Amin Mohammed-Coleman <aminmc@gmail.com> wrote:

> Hi
> Got it working!  Thanks again for your help!
>
>
> Amin
>
>
> On Sat, Mar 7, 2009 at 12:25 PM, Amin Mohammed-Coleman <aminmc@gmail.com> wrote:
>
>> Thanks!  The final piece that I needed to do for the project!
>> Cheers
>>
>> Amin
>>
>> On Sat, Mar 7, 2009 at 12:21 PM, Uwe Schindler <uwe@thetaphi.de> wrote:
>>
>>> > Cool.  I will use compression and store in the index. Is there anything
>>> > special I need to do for decompressing the text? I presume I can just do
>>> > doc.get("content")?
>>> > Thanks for your advice all!
>>>
>>> No, just use Field.Store.COMPRESS when adding to the index and
>>> Document.get() when fetching. The decompression is done automatically.
>>>
>>> You may think, why not enable compression for all fields? The reason is
>>> that it adds overhead for very small and short fields, so you should only
>>> use it for large contents (it's the same as compressing very small files
>>> with ZIP/GZIP: these files mostly get larger than without compression).
>>>
>>> Uwe
>>>
>>
>
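
Uwe's point quoted above, that compressing very short fields tends to make them larger, can be checked with a small standalone sketch. It uses java.util.zip directly rather than Lucene's Field.Store.COMPRESS code path, so the byte counts are only illustrative, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Illustration only: this is plain GZIP from the JDK, not Lucene's
// internal compression for Field.Store.COMPRESS.
class CompressionOverheadSketch {

    // Returns how many bytes the input occupies after GZIP compression.
    static int gzipSize(byte[] input) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(input);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        // A very short field value, similar to a single search term.
        byte[] shortField = "aspectj".getBytes(StandardCharsets.UTF_8);

        // A long, repetitive body standing in for large document content.
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            body.append("a word document that mentions aspectj ");
        }
        byte[] longField = body.toString().getBytes(StandardCharsets.UTF_8);

        System.out.println("short: " + shortField.length
                + " bytes -> " + gzipSize(shortField) + " bytes compressed");
        System.out.println("long:  " + longField.length
                + " bytes -> " + gzipSize(longField) + " bytes compressed");
    }
}
```

The short value grows because the GZIP header and trailer alone outweigh the data, while the long body shrinks considerably, which is why compression is only worth enabling for large stored fields.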
