lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Highlight the searched word when full-text searching performed
Date Mon, 28 Nov 2005 02:03:34 GMT

On 27 Nov 2005, at 00:24, Jerry Stern wrote:
>   How can I highlight the searched word in the results of a
> Lucene full-text search?
>   At the indexing stage, the contents of the original file are
> added as a field of a Lucene document:
>    private static void indexFile(File file, IndexWriter idxWriter)
>  throws IOException {
>   System.out.print("Indexing " + file.getCanonicalPath() + " ......");
>
>   Document doc = new Document();
>   doc.add(Field.Text("path", file.getAbsolutePath()));
>   doc.add(Field.Text("contents", new FileReader(file)));
>
>   idxWriter.addDocument(doc);
>
>   System.out.println("indexed.");
>  }
>
>   At the searching stage:
>     Highlighter highlighter = new Highlighter(new QueryScorer(query));
>   for (int i = 0; i < hits.length(); i++)
>   {
>    String text = hits.doc(i).get("contents"); // the text = null.
>
>      TokenStream tokenStream = analyzer.tokenStream("path",
>      new StringReader(text));
>    // Get the 3 best fragments and separate them with "..."
>    String result = highlighter.getBestFragments(tokenStream,
>      text, 3, "...");
>    System.out.println(result);
>   }
>
>   The 'contents' field is not stored in the index, and it is not
> reasonable to store it there. So the line marked "the text = null"
> above cannot get the 'contents' field from the index.
>   I think that the 'text' parameter of the
> Highlighter.getBestFragments(..) method must be the context string
> of the searched word. So my question is: how can I get the context
> string of the searched word?

In your case, you'll need to get the "path" field (since that is  
being stored) and then load the file into a String to pass to the  
highlighter.  The text to highlight must be stored somewhere, and in  
your case it is on the filesystem only.
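
A minimal sketch of the file-loading step, in plain JDK code with no Lucene dependency. The class name StoredPathText, the method name loadText, and the explicit charset parameter are illustrative assumptions, not part of Lucene or of the original code. The charset should match whatever encoding the files were read with at indexing time (the FileReader above uses the platform default).

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: turns the stored "path" value back into the text to highlight.
final class StoredPathText {

    private StoredPathText() {
        // static helper only, not meant to be instantiated
    }

    /**
     * Reads the whole file at the given path into a String.
     * The caller picks the charset; this sketch does not try to detect it.
     */
    static String loadText(String path, Charset charset) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(path));
        return new String(bytes, charset);
    }
}
```

Inside the hits loop, the line that currently returns null would then become something like `String text = StoredPathText.loadText(hits.doc(i).get("path"), Charset.defaultCharset());`, with the rest of the loop unchanged.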

Be careful - you're highlighting all hits there, and each one means
reading and re-analyzing a whole file, so it will get slow if you get
a lot of hits.
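
One way to keep that bounded is to highlight only the hits on the page being displayed. The sketch below just computes the index range for such a page. The names HighlightWindow and pageBounds and the zero-based page numbering are assumptions made for illustration, not Lucene API.

```java
// Hypothetical helper: decides which slice of the hits is worth highlighting.
final class HighlightWindow {

    private HighlightWindow() {
        // static helper only, not meant to be instantiated
    }

    /**
     * Returns {start, end} for the given zero-based page, where start is
     * inclusive and end is exclusive, both clamped to totalHits.
     */
    static int[] pageBounds(int totalHits, int page, int pageSize) {
        if (totalHits < 0 || page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("negative count or non-positive page size");
        }
        // long arithmetic avoids int overflow for large page numbers
        long start = Math.min((long) page * pageSize, totalHits);
        long end = Math.min(start + pageSize, totalHits);
        return new int[] {(int) start, (int) end};
    }
}
```

The search loop would then run from the returned start to the returned end with `hits.length()` passed in as totalHits, instead of running from 0 to `hits.length()`.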

	Erik

