lucene-general mailing list archives

From Koji Sekiguchi <k...@r.email.ne.jp>
Subject Re: Highlighter doesn't return any fragments for terms towards the end of the document
Date Fri, 15 Jul 2011 00:11:49 GMT
(11/07/15 8:23), SBS wrote:
> I have come across a somewhat baffling problem.  I am indexing HTML documents
> and one of them is larger than the rest at about 200K.  For some reason when
> I search for terms which occur only towards the end of the document (i.e.
> after some apparent "cutoff" point in the document), the document itself is
> returned as a match but when I call Highlighter#getBestFragments() it
> returns an empty array.  This same method returns fragments if the terms
> occur in the first part of the document.
>
> So, am I running into some size limitation in either documents or fragments?
> What else could be causing this behaviour?

Yes, there is a limit: by default the Highlighter analyzes only the first 50*1024 characters of a document. Try setting the following parameter to a higher value:

http://lucene.apache.org/java/3_3_0/api/all/org/apache/lucene/search/highlight/Highlighter.html#setMaxDocCharsToAnalyze%28int%29
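
To illustrate why terms after the cutoff produce no fragments, here is a minimal sketch in plain Java. It is not Lucene code: the class HighlightCutoffSketch and its helpers findFragments and buildDocument are hypothetical names made up for this example, and the matching is a simple substring search rather than real token analysis. It only assumes the behaviour described above, namely that text at or beyond the cap is never examined. In Lucene itself, the fix would be to call highlighter.setMaxDocCharsToAnalyze(...) with a value at least as large as the document (for example Integer.MAX_VALUE) before calling getBestFragments().

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a highlighter with a character cap; NOT Lucene code. */
final class HighlightCutoffSketch {

    // Same value the reply gives as the default: 50*1024 characters.
    static final int DEFAULT_MAX_CHARS = 50 * 1024;

    /**
     * Hypothetical helper: returns a snippet around each occurrence of term,
     * but ignores any occurrence starting at or after maxCharsToAnalyze.
     */
    static List<String> findFragments(String text, String term,
                                      int maxCharsToAnalyze, int context) {
        List<String> fragments = new ArrayList<>();
        int from = 0;
        while (true) {
            int hit = text.indexOf(term, from);
            // Stop at the first hit at or beyond the cap: later text is never examined.
            if (hit < 0 || hit >= maxCharsToAnalyze) {
                break;
            }
            int start = Math.max(0, hit - context);
            int end = Math.min(text.length(), hit + term.length() + context);
            fragments.add(text.substring(start, end));
            from = hit + term.length();
        }
        return fragments;
    }

    /** Hypothetical helper: filler text of about size chars with term placed at termOffset. */
    static String buildDocument(int size, int termOffset, String term) {
        StringBuilder sb = new StringBuilder();
        while (sb.length() < termOffset) {
            sb.append("filler ");
        }
        sb.setLength(termOffset);   // trim so the term starts exactly at termOffset
        sb.append(term);
        while (sb.length() < size) {
            sb.append(" filler");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A roughly 200K document whose only match sits at offset 150K, past the default cap.
        String doc = buildDocument(200 * 1024, 150 * 1024, "needle");
        // Default cap: the match is never reached, so no fragments come back.
        System.out.println(findFragments(doc, "needle", DEFAULT_MAX_CHARS, 20).size()); // prints 0
        // Raised cap: the same document now yields one fragment.
        System.out.println(findFragments(doc, "needle", Integer.MAX_VALUE, 20).size()); // prints 1
    }
}
```

The document still "matches" in both runs, since the search index covers the whole text; only the fragment extraction is subject to the cap, which is the symptom described in the question.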

koji
-- 
http://www.rondhuit.com/en/
