Hi
Please find attached a test case plus a document. Just to mention, this
also occurs sometimes for other files.
Cheers
Amin
On Wed, Mar 11, 2009 at 6:11 PM, markharw00d <markharw00d@yahoo.co.uk> wrote:
> If you can supply a JUnit test that recreates the problem, I think we can
> start to make progress on this.
>
>
>
> Amin Mohammed-Coleman wrote:
>
>> Hi
>>
>> Apologies for resending this mail. Just wondering if anyone has
>> experienced the below. I'm not sure if this could happen due to the nature
>> of the document. It does seem strange that a search on one term returns a
>> summary while another does not, even though the same document is returned.
>>
>> I'm asking this so I can code around it if this is normal.
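
One way to "code around" an empty highlighter result is sketched below. It is only an illustration, not something proposed in this thread: the class name `SummaryFallback`, the method `summaryOrFallback`, and the `maxChars` parameter are all hypothetical. It assumes the stored field text is available as a plain string and simply falls back to the leading part of that text when the highlighted summary is empty.

```java
public class SummaryFallback {

    /**
     * Hypothetical helper: returns the highlighter output when it is non-empty,
     * otherwise the leading part of the stored text, cut at a word boundary.
     */
    static String summaryOrFallback(String highlighted, String text, int maxChars) {
        if (highlighted != null && !highlighted.trim().isEmpty()) {
            return highlighted; // the highlighter produced a summary, use it as-is
        }
        if (text == null) {
            return "";
        }
        String trimmed = text.trim();
        if (trimmed.length() <= maxChars) {
            return trimmed; // short enough to show in full
        }
        // Cut at the last space at or before maxChars so no word is split.
        int cut = trimmed.lastIndexOf(' ', maxChars);
        if (cut <= 0) {
            cut = maxChars; // no space found, fall back to a hard cut
        }
        return trimmed.substring(0, cut) + "...";
    }

    public static void main(String[] args) {
        // Empty highlighter result: fall back to the start of the text.
        System.out.println(summaryOrFallback("", "alpha beta gamma delta", 12));
        // Non-empty highlighter result: returned unchanged.
        System.out.println(summaryOrFallback("<b>aspectj</b> rocks", "ignored", 12));
    }
}
```

This only hides the symptom; it does not explain why the highlighter returns nothing for some terms.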
>>
>>
>> Apologies again for resending this mail
>>
>> Cheers
>>
>> Amin
>>
>> On 9 Mar 2009, at 07:50, Amin Mohammed-Coleman <aminmc@gmail.com> wrote:
>>
>> Hi
>>>
>>> I am seeing some strange behaviour with the highlighter and I'm wondering
>>> if anyone else is experiencing this. In certain instances I don't get a
>>> summary being generated. I perform the search and the search returns the
>>> correct document. I can see that the Lucene document contains the text in
>>> the field. However after doing:
>>>
>>> SimpleHTMLFormatter simpleHTMLFormatter =
>>>     new SimpleHTMLFormatter("<span class=\"highlight\"><b>", "</b></span>");
>>> // required for highlighting
>>> Query query2 = multiSearcher.rewrite(query);
>>> Highlighter highlighter =
>>>     new Highlighter(simpleHTMLFormatter, new QueryScorer(query2));
>>> ...
>>>
>>> String text = doc.get(FieldNameEnum.BODY.getDescription());
>>> TokenStream tokenStream = analyzer.tokenStream(
>>>     FieldNameEnum.BODY.getDescription(), new StringReader(text));
>>> String result = highlighter.getBestFragments(tokenStream, text, 3, "...");
>>>
>>>
>>> The string result is empty. This is very strange: if I try a different
>>> term that exists in the document, I get a summary. For example, I have a
>>> Word document that contains the terms "document" and "aspectj". If I search
>>> for "document" I get the correct document but no highlighted summary.
>>> However, if I search for "aspectj" I get the same document with a
>>> highlighted summary.
>>>
>>> Just to mention, I do rewrite the original query before performing the
>>> highlighting.
>>>
>>> I'm not sure what I'm missing here. Any help would be appreciated.
>>>
>>> Cheers
>>> Amin
>>>
>>> On Sat, Mar 7, 2009 at 4:32 PM, Amin Mohammed-Coleman <aminmc@gmail.com>
>>> wrote:
>>> Hi
>>>
>>> Got it working! Thanks again for your help!
>>>
>>>
>>> Amin
>>>
>>>
>>> On Sat, Mar 7, 2009 at 12:25 PM, Amin Mohammed-Coleman <aminmc@gmail.com>
>>> wrote:
>>> Thanks! The final piece that I needed to do for the project!
>>>
>>> Cheers
>>>
>>> Amin
>>>
>>> On Sat, Mar 7, 2009 at 12:21 PM, Uwe Schindler <uwe@thetaphi.de> wrote:
>>> > Cool. I will use compression and store in the index. Is there anything
>>> > special I need to do for decompressing the text? I presume I can just do
>>> > doc.get("content")?
>>> > Thanks for your advice, all!
>>>
>>> No, just use Field.Store.COMPRESS when adding to the index and Document.get()
>>> when fetching. The decompression is done automatically.
>>>
>>> You may wonder why compression is not enabled for all fields. The reason is
>>> that it adds overhead for very small and short fields, so you should only
>>> use it for large contents (it is the same as compressing very small files
>>> with ZIP/GZIP: these files mostly get larger than without compression).
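
The point about small values can be illustrated with a minimal sketch using only the JDK. It assumes a zlib-style compressor (`java.util.zip.Deflater`) as a stand-in; it does not call Lucene, and the class name `CompressionOverhead` and the sample strings are made up for the example. It compares the size of a very short value and a long, repetitive one before and after compression.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

public class CompressionOverhead {

    /** Returns the number of bytes produced by deflating the UTF-8 bytes of the text. */
    static int compressedSize(String text) {
        byte[] input = text.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[1024];
        int total = 0;
        // Drain the compressor; only the byte count matters here.
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        String shortValue = "id-42"; // stands in for a small field such as an ID
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("the quick brown fox jumps over the lazy dog ");
        }
        String longValue = sb.toString(); // stands in for a large body field

        System.out.println("short: " + shortValue.length() + " -> " + compressedSize(shortValue));
        System.out.println("long:  " + longValue.length() + " -> " + compressedSize(longValue));
    }
}
```

The short value comes out larger than its input because the stream header and checksum alone exceed its length, while the long value shrinks considerably.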
>>>
>>> Uwe
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>>
>>>
>>>