lucene-java-user mailing list archives

From Amin Mohammed-Coleman <ami...@gmail.com>
Subject Re: Lucene Highlighting and Dynamic Summaries
Date Wed, 11 Mar 2009 17:59:25 GMT
Hi

Apologies for resending this mail. Just wondering if anyone has
experienced the below. I'm not sure if this could happen due to the
nature of the document. It does seem strange that a search on one term
returns a summary while another does not, even though the same document
is being returned.

I'm asking this so I can code around it if it is normal.


Apologies again for resending this mail.

Cheers

Amin

Sent from my iPhone
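
If an empty highlighter result does turn out to be something that has to be coded around, one option is a plain-text fallback. The following is only a minimal sketch under that assumption: `SummaryFallback` and `summaryOrFallback` are hypothetical names, not part of Lucene, and `highlighted` stands for whatever `highlighter.getBestFragments(...)` returned.

```java
final class SummaryFallback {

    /**
     * Returns the highlighter output when it is non-empty. Otherwise returns
     * the leading part of the raw field text: at most maxChars characters,
     * cut back to the last space and followed by "...".
     */
    static String summaryOrFallback(String highlighted, String text, int maxChars) {
        if (highlighted != null && !highlighted.trim().isEmpty()) {
            return highlighted;
        }
        if (text == null) {
            return "";
        }
        String trimmed = text.trim();
        if (trimmed.length() <= maxChars) {
            return trimmed;
        }
        String head = trimmed.substring(0, maxChars);
        int lastSpace = head.lastIndexOf(' ');
        if (lastSpace > 0) {
            // Avoid ending the summary in the middle of a word.
            head = head.substring(0, lastSpace);
        }
        return head + "...";
    }

    public static void main(String[] args) {
        String body = "This document describes how aspectj weaving is configured.";
        // Empty highlighter result: fall back to the start of the text.
        System.out.println(summaryOrFallback("", body, 30));
        // Non-empty highlighter result: returned unchanged.
        System.out.println(summaryOrFallback("<b>aspectj</b> weaving", body, 30));
    }
}
```

Note that the fallback returns raw, unescaped text, so it would still need HTML escaping before being placed in a page next to the highlighted markup.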

On 9 Mar 2009, at 07:50, Amin Mohammed-Coleman <aminmc@gmail.com> wrote:

> Hi
>
> I am seeing some strange behaviour with the highlighter and I'm
> wondering if anyone else is experiencing this.  In certain instances
> no summary is generated.  I perform the search and the
> search returns the correct document.  I can see that the Lucene
> document contains the text in the field.  However, after doing:
>
> SimpleHTMLFormatter simpleHTMLFormatter =
>     new SimpleHTMLFormatter("<span class=\"highlight\"><b>", "</b></span>");
> // required for highlighting
> Query query2 = multiSearcher.rewrite(query);
> Highlighter highlighter =
>     new Highlighter(simpleHTMLFormatter, new QueryScorer(query2));
> ...
>
> String text = doc.get(FieldNameEnum.BODY.getDescription());
> TokenStream tokenStream =
>     analyzer.tokenStream(FieldNameEnum.BODY.getDescription(), new StringReader(text));
> String result = highlighter.getBestFragments(tokenStream, text, 3, "...");
>
>
> the string result is empty.  This is very strange: if I try a
> different term that exists in the document then I get a summary.
> For example, I have a Word document that contains the terms "document"
> and "aspectj".  If I search for "document" I get the correct
> document but no highlighted summary.  However, if I search using
> "aspectj" I get the same document with a highlighted summary.
>
> Just to mention, I do rewrite the original query before performing
> the highlighting.
>
> I'm not sure what I'm missing here.  Any help would be appreciated.
>
> Cheers
> Amin
>
> On Sat, Mar 7, 2009 at 4:32 PM, Amin Mohammed-Coleman <aminmc@gmail.com> wrote:
> Hi
>
> Got it working!  Thanks again for your help!
>
>
> Amin
>
>
> On Sat, Mar 7, 2009 at 12:25 PM, Amin Mohammed-Coleman <aminmc@gmail.com> wrote:
> Thanks!  The final piece that I needed to do for the project!
>
> Cheers
>
> Amin
>
> On Sat, Mar 7, 2009 at 12:21 PM, Uwe Schindler <uwe@thetaphi.de> wrote:
> > cool.  I will use compression and store in the index. Is there anything
> > special I need to do for decompressing the text? I presume I can just do
> > doc.get("content")?
> > Thanks for your advice all!
>
> No, just use Field.Store.COMPRESS when adding to the index and
> Document.get() when fetching. The decompression is done automatically.
>
> You may wonder, why not enable compression for all fields? The reason
> is that it adds overhead for very small and short fields. So you should
> only use it for large contents (it's the same as compressing very small
> files with ZIP/GZIP: these files mostly end up larger than without
> compression).
>
> Uwe
>
>
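
Uwe's point above about compression overhead on short fields can be checked with the JDK alone. The sketch below uses `java.util.zip.Deflater` as a stand-in and compares input size with compressed size; it is an illustration of the general effect, not of what Lucene does internally, which may differ in detail. The class name `CompressionOverheadDemo` and the sample strings are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

final class CompressionOverheadDemo {

    /** Returns the number of bytes produced by deflating the input. */
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[1024];
        int total = 0;
        // Drain the compressor; only the byte count matters here.
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        // A very short field value: header and trailer bytes dominate.
        byte[] small = "aspectj".getBytes(StandardCharsets.UTF_8);

        // A large, repetitive body: compression pays off.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            sb.append("This document mentions aspectj. ");
        }
        byte[] large = sb.toString().getBytes(StandardCharsets.UTF_8);

        System.out.println("small: " + small.length + " -> " + deflatedSize(small) + " bytes");
        System.out.println("large: " + large.length + " -> " + deflatedSize(large) + " bytes");
    }
}
```

The short value comes out larger than it went in, while the large body shrinks considerably, which is consistent with the advice to compress only large content fields.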
