lucene-java-user mailing list archives

From Ritu choudhary <ritu.it...@gmail.com>
Subject Re: highlighting searched results in document
Date Wed, 27 May 2009 08:05:18 GMT
I want to confirm the output of the statement below. What I get in
"result" is just the word I am searching for (say the word is
"registered"). How can I get the whole fragment in which the word is
found, and show the word highlighted within that fragment or document?

String result =
    highlighter.getBestFragments(tokenStream, text, 5, "...");
System.out.println("result:" + result);
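
To make the two outputs being asked about concrete, here is a minimal plain-Java sketch that does not use Lucene at all. highlight() returns the whole text with each match wrapped in tags; bestFragments() cuts the text into pieces of at most a given size, keeps the pieces that contain the word, and joins them with a separator, which is roughly the shape of result the four-argument getBestFragments call above is expected to give. The class and method names are made up for illustration, and the fragment selection here is a simplification (document order, no scoring), not the highlighter's actual algorithm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical plain-Java illustration; none of this is Lucene API.
class FragmentSketch {

    // Whole-word, case-insensitive pattern for a literal term.
    static Pattern termPattern(String term) {
        return Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                Pattern.CASE_INSENSITIVE);
    }

    // Returns the entire text with every match wrapped in pre/post tags.
    static String highlight(String text, String term, String pre, String post) {
        Matcher m = termPattern(term).matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement keeps '$' and '\' in the match or tags literal.
            m.appendReplacement(out,
                    Matcher.quoteReplacement(pre + m.group() + post));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Greedily packs whole words into fragments of at most `size` characters
    // (a single word longer than `size` becomes its own fragment).
    static List<String> fragments(String text, int size) {
        List<String> out = new ArrayList<String>();
        StringBuilder cur = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (cur.length() > 0 && cur.length() + 1 + word.length() > size) {
                out.add(cur.toString());
                cur.setLength(0);
            }
            if (cur.length() > 0) {
                cur.append(' ');
            }
            cur.append(word);
        }
        if (cur.length() > 0) {
            out.add(cur.toString());
        }
        return out;
    }

    // Keeps up to `max` fragments that contain the term, in document order,
    // highlights each with <b> tags, and joins them with the separator.
    static String bestFragments(String text, String term,
                                int size, int max, String separator) {
        Pattern p = termPattern(term);
        StringBuilder out = new StringBuilder();
        int kept = 0;
        for (String fragment : fragments(text, size)) {
            if (kept == max) {
                break;
            }
            if (!p.matcher(fragment).find()) {
                continue;
            }
            if (kept > 0) {
                out.append(separator);
            }
            out.append(highlight(fragment, term, "<b>", "</b>"));
            kept++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "Only registered users can post. Guests can read."
                + " Registered on 27 May.";
        System.out.println(highlight(text, "registered", "<b>", "</b>"));
        System.out.println(bestFragments(text, "registered", 20, 5, "..."));
    }
}
```

In the real highlighter, both the Fragmenter passed to setTextFragmenter and the text argument (which has to be the document content the token stream was built from) affect how much surrounding text comes back, so those are two things worth checking when the result is shorter than expected.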

On 27/05/2009, KK <dioxide.software@gmail.com> wrote:
> Hi,
> AFAIK, the default is to bold the matched text. If you want something
> else, say highlighting it with a color, you have to set that up yourself
> instead of relying on the default bolding.
> The following is a working example from LIA 2nd Edn [verbatim copy] for hit
> highlighting.
>
> import java.io.*;
> import org.apache.lucene.analysis.SimpleAnalyzer;
> import org.apache.lucene.analysis.standard.StandardAnalyzer;
> import org.apache.lucene.search.TermQuery;
> import org.apache.lucene.search.PhraseQuery;
> import org.apache.lucene.search.highlight.Highlighter;
> import org.apache.lucene.search.highlight.SpanScorer;
> import org.apache.lucene.index.Term;
> import org.apache.lucene.analysis.TokenStream;
> import org.apache.lucene.search.highlight.QueryScorer;
> import org.apache.lucene.search.Scorer;
> import org.apache.lucene.search.highlight.SimpleFragmenter;
> import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
> import org.apache.lucene.search.highlight.Fragmenter;
>
> public class HighlightIt {
>   private static final String text =
>       "Contrary to popular belief, Lorem Ipsum is" +
>       " not simply random text. It has roots in a piece of" +
>       " classical Latin literature from 45 BC, making it over" +
>       " 2000 years old. Richard McClintock, a Latin professor" +
>       " at Hampden-Sydney College in Virginia, looked up one" +
>       " of the more obscure Latin words, consectetur, from" +
>       " a Lorem Ipsum passage, and going through the cites" +
>       " of the word in classical literature, discovered the" +
>       " undoubtable source. Lorem Ipsum comes from sections" +
>       " 1.10.32 and 1.10.33 of \"de Finibus Bonorum et" +
>       " Malorum\" (The Extremes of Good and Evil) by Cicero," +
>       " written in 45 BC. This book is a treatise on the" +
>       " theory of ethics, very popular during the" +
>       " Renaissance. The first line of Lorem Ipsum, \"Lorem" +
>       " ipsum dolor sit amet..\", comes from a line in" +
>       " section 1.10.32."; // from http://www.lipsum.com/
>
>   public static void main(String[] args) throws IOException {
>     // args[0] throws if no argument is given, so check the length first
>     if (args.length != 1) {
>       System.err.println("Usage: HighlightIt <filename>");
>       System.exit(-1);
>     }
>     String filename = args[0];
>     //TermQuery query = new TermQuery(new Term("f", "literature"));
>     PhraseQuery phrase = new PhraseQuery();
>     phrase.add(new Term("f", "lorem"));
>     phrase.add(new Term("f", "ipsum"));
>     phrase.add(new Term("f", "passage"));
>     phrase.setSlop(0);
>
>     QueryScorer scorer = new QueryScorer(phrase);
>
>     SimpleHTMLFormatter formatter =
>         new SimpleHTMLFormatter("<span class=\"highlight\">",
>             "</span>");
>     Highlighter highlighter = new Highlighter(formatter, scorer);
>
>     Fragmenter fragmenter = new SimpleFragmenter(50);
>
>     highlighter.setTextFragmenter(fragmenter);
>
>     TokenStream tokenStream = new StandardAnalyzer()
>         .tokenStream("f", new StringReader(text));
>
>     String result =
>         highlighter.getBestFragments(tokenStream, text, 5, "...");
>     System.out.println("result:" + result);
>
>     //@Ritu, remove the following chunk for your requirement
>
>     FileWriter writer = new FileWriter(filename);
>     writer.write("<html>");
>     writer.write("<style>\n" +
>         ".highlight {\n" +
>         " background: yellow;\n" +
>         "}\n" +
>         "</style>");
>     writer.write("<body>");
>     writer.write(result);
>     writer.write("</body></html>");
>     writer.close();
>     // remove up to this point
>   }
> }
> --------------
> Make sure you have all the Lucene jars in your classpath. As you can see in
> the last part of the code, the final output is written to a file. For your
> requirement, remove that code, including the part that adds the html and
> style tags.
> The code now adds the highlight span wherever there is a match, so the
> HTML page you use to view the results in a browser needs the matching
> style rule. Put it inside <style> </style> tags (not <script> tags) in the
> page's <head>, like this:
> <style>
> .highlight {
> background: yellow;
> }
> </style>
>
> I hope this works. If you still have problems, post them here.
>
> HTH,
> KK
>
> On Wed, May 27, 2009 at 11:26 AM, Ritu choudhary
> <ritu.itzme@gmail.com> wrote:
>
>> Hi there,
>>    I am using the Lucene highlighter to highlight search results,
>> but it shows only the query string in bold.
>> Is there any way I can use it to show the highlighted text in the
>> document where it is found?
>>  I need to show the search terms highlighted in the
>> document where they are found, and I want to do it without using
>> org.apache.lucene.search.Hits.
>> Please help. Thanks in advance.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
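
On the CSS part of the quoted reply: a CSS rule only takes effect inside a <style> element (normally in the page's <head>) or in a linked stylesheet, not inside <script> tags. The following is a minimal sketch, assuming the highlighted string has already been produced with the span class "highlight"; the class and method names are hypothetical.

```java
// Hypothetical helper: wraps an already-highlighted fragment in a minimal
// HTML page whose <head> carries the CSS rule for the "highlight" class.
class HighlightPage {

    static String buildPage(String highlighted) {
        return "<html>\n"
                + "<head>\n"
                + "<style>\n"
                + ".highlight { background: yellow; }\n"
                + "</style>\n"
                + "</head>\n"
                + "<body>\n"
                + highlighted + "\n"
                + "</body>\n"
                + "</html>\n";
    }

    public static void main(String[] args) {
        // Sample input in the shape the span-based formatter would produce.
        String result = "a <span class=\"highlight\">Lorem</span> Ipsum passage";
        System.out.print(buildPage(result));
    }
}
```

The sketch does not escape the surrounding text, so it assumes the document content is safe to embed in HTML as-is.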
