lucene-general mailing list archives

From Mayank Shrivastava <mayankshrivastava...@gmail.com>
Subject Problems with lucene highlighter
Date Tue, 08 Jun 2010 12:26:33 GMT
Hi,

I am using Lucene Highlighter 2.4.1 in my application to find the
best-matching fragments and display them. To do this I call a helper
function with the signature String[]
getFragmentsWithHighlightedTerms(Analyzer analyzer, Query query,
String fieldName, String fieldContents, int fragmentsNumber, int
fragmentSize). For example:

String text = doc.get("MetaData");
getFragmentsWithHighlightedTerms(analyzer, query, "MetaData", text, 5, 100);

The function getFragmentsWithHighlightedTerms() is defined as follows:

private static String[] getFragmentsWithHighlightedTerms(
      Analyzer analyzer, Query query, String fieldName,
      String fieldContents, int fragmentsNumber, int fragmentSize)
{
   // Tokenize the stored field text with the given analyzer.
   TokenStream stream =
      TokenSources.getTokenStream(fieldName, fieldContents, analyzer);
   // Score fragments against the query, reading a cached copy of the tokens.
   SpanScorer scorer =
      new SpanScorer(query, fieldName, new CachingTokenFilter(stream));
   Fragmenter fragmenter = new SimpleSpanFragmenter(scorer, fragmentSize);

   Highlighter highlighter = new Highlighter(scorer);
   highlighter.setTextFragmenter(fragmenter);
   highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE);

   // Ask for the top fragmentsNumber fragments of the field text.
   String[] fragments =
      highlighter.getBestFragments(stream, fieldContents, fragmentsNumber);

   return fragments;
}

Now my trouble is that the highlighter.getBestFragments() method is
returning duplicates, i.e., if I display, say, the first 5 fragments,
nos. 1 and 3 are the same. I do not understand what is causing this.
Is there a problem with the code?
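
For reference, a stopgap that filters out exact duplicates after the
fact might look like the sketch below. The class and method names
(FragmentDedup, dedupe) are hypothetical and not part of Lucene, and
it assumes the duplicates are identical strings. It only hides the
symptom; it does not explain why the highlighter returns the same
fragment twice.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene: drops repeated fragments.
class FragmentDedup {

    // Returns at most maxFragments distinct fragments, keeping the
    // order in which they first appeared in the input array.
    static String[] dedupe(String[] fragments, int maxFragments) {
        Set<String> seen = new LinkedHashSet<>();
        for (String fragment : fragments) {
            if (seen.size() >= maxFragments) {
                break;
            }
            seen.add(fragment); // no effect if already present
        }
        return seen.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Made-up sample fragments in which the 1st and 3rd are the same.
        String[] fragments = {
            "foo <B>bar</B>", "baz", "foo <B>bar</B>", "qux"
        };
        for (String fragment : dedupe(fragments, 5)) {
            System.out.println(fragment);
        }
    }
}
```

With this approach it would make sense to request more fragments from
the highlighter than are needed for display, so that enough remain
after the duplicates are dropped.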
