lucene-java-user mailing list archives

From "Olivier Catteau" <ocatteau_luc...@hotmail.com>
Subject How to use Highlighter concretely?
Date Wed, 23 Jun 2004 15:19:10 GMT
Hi!

I'd like to use the Highlighter class to show a highlighted summary after a search, but I
don't know how to use the Highlighter class correctly.
I found this piece of code, which works well.


--------------------------------------------------------------------------------

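import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;

public class TestHighlighter {

    public static void main(String[] args) {
        try {
            Analyzer a = new StandardAnalyzer();
            // Parse the query against the "cached" field.
            Query q = QueryParser.parse("jennifer lopez", "cached", a);
            String s =
                "the unofficial home page Britney Spears Elizabeth Hurley Kirsten Dunst "
                + "Anna Kournikova Katie Holmes Katherine Heigl Jessica Alba Alyson Hannigan Jennifer "
                + "Lopez Sarah Michelle Gellar";
            // Score fragments by the query terms they contain.
            Highlighter highlighter = new Highlighter(new QueryScorer(q));
            // Re-analyze the raw text with the same analyzer and field name.
            TokenStream tokenstream =
                a.tokenStream("cached", new java.io.StringReader(s));
            // Keep the 2 best fragments, joined by "...".
            String summary = highlighter.getBestFragments(tokenstream, s, 2, "...");
            System.out.println("summary : " + summary);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}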


--------------------------------------------------------------------------------


But I don't know how to adapt it. I have run a search and obtained a Hits instance, and
now I want to display a highlighted summary of each document in the hits. So it should look
like this:

--------------------------------------------------------------------------------

Highlighter highlighter;
TokenStream tokenstream;

for (int i = 0; i < hits.length(); i++) {
    Document doc = hits.doc(i);

    String contents = I DON'T KNOW HOW TO GET THE CONTENTS OF MY DOC

    highlighter = new Highlighter(new QueryScorer(query));
    tokenstream = analyzer.tokenStream("contents", new java.io.StringReader(contents));
    String summary = highlighter.getBestFragments(tokenstream, contents, 2, "...");
    System.out.println("summary : " + summary);
}


--------------------------------------------------------------------------------


Here are my questions. First, is this the right way to get a highlighted summary? And if it
is, what is the best way to get the contents of each document: the same way I read the
contents when indexing, or some other way?

(To be more precise, I use Lucene to index PDF, DOC, and TXT files. Each of these documents
can be about 5 MB in size.)
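
To make the expected result concrete, here is a minimal plain-Java sketch of the kind of output meant by "highlighted summary": matching words wrapped in markup. It does not use Lucene and is not how Highlighter works internally; the class name SimpleHighlightSketch, the highlight method, and the <B> tags are assumptions chosen only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only: not part of Lucene.
public class SimpleHighlightSketch {

    // Wraps each space-separated word of `text` that matches a query term
    // (compared case-insensitively) in <B>...</B>.
    static String highlight(String text, Set<String> terms) {
        StringBuilder out = new StringBuilder();
        for (String word : text.split(" ")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            if (terms.contains(word.toLowerCase(Locale.ROOT))) {
                out.append("<B>").append(word).append("</B>");
            } else {
                out.append(word);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Query terms are lower-cased, as an analyzer would typically do.
        Set<String> terms = new HashSet<>(Arrays.asList("jennifer", "lopez"));
        String text = "Alyson Hannigan Jennifer Lopez Sarah Michelle Gellar";
        System.out.println(highlight(text, terms));
    }
}
```

Unlike this sketch, the Highlighter also selects the best-scoring fragments of the text, which matters for documents of several megabytes.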

Thanks.
