lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From MMach...@LEVI.com
Subject RE: a simple highlight resolvent
Date Thu, 10 Apr 2003 18:02:07 GMT
Hi,
Where I can find the HTMLParser source? Because I found a file named
HTMLParser.jj but I believe that is not the right. In my directory of
lucene-1.2 inside org.apache.lucene.demo.html it isn't.
Thanks,
Michel

-----Original Message-----
From: keelkerr@hotmail.com [mailto:keelkerr@hotmail.com] 
Sent: Thursday, April 10, 2003 7:30 PM
To: Lucene Users List
Subject: Re: a simple highlight resolvent

Hi, I think you can't use it in this way, it's for when you get a file(isn't
a directory), such as test.txt or test.html then highlight "DAMS3020" in
your file. if you want to highlight whole index, you have to get all indexed
file one by one first. But Highlighter only need to constract only once
   query = QueryParser.parse(queryString, "contents", analyzer);
   hl = new org.apache.lucene.demo.HightLighter(query.toString("contents"));
the query.toString("contents") is parsed queryString, such as 
   queryString = (a+b)-c; then  query.toString("contents")  will be as a
Token "a" "+b" "-c".
then call h.getHighlightFile("c:\\temp\\aRealFile") for each file, then get
a highlighted string include your all terms.

In the source, I used HTMLParser to read .html file. If you file isn't
.html, you should change the souce a little
from 
  HTMLParser parser = new HTMLParser(new File(path));
  Reader fr = parser.getReader();
to
  FileReader fr = new FileReader(new File(path));

hope this can help.
         Kerr 

----- Original Message ----- 
From: <MMachado@LEVI.com>
To: <lucene-user@jakarta.apache.org>
Sent: Friday, April 11, 2003 12:45 AM
Subject: RE: a simple highlight resolvent


> Hi,
> Me again, at the final of your program I put a main() method to run the
> application like this:
> 
> public static void main(String[] args) {
>   try {
>   hightLighter h = new hightLighter("DAMS3020");
>   h.getHighlightFile("c:\\myindex");
>   }catch (Exception h) {
>   System.out.println(h.getMessage());
>   }
>   }
> and I received the following message:
> 
> c:\myindex (Access is denied). I don't understand why. Is my folder and
when
> I do a search for the same string (DAMS3020) I receive a link where I can
to
> find the string specified, so I can access, isn't it????. The string is in
> an excel file(is a problem maybe?).
> Thanks in advance if somebody can help me.
> Michel
> 
> -----Original Message-----
> From: Machado, Michel 
> Sent: Thursday, April 10, 2003 3:40 PM
> To: lucene-user@jakarta.apache.org
> Subject: RE: a simple highlight resolvent
> 
> Hi Kerr,
> With your program is possible to highlight a string or word inside a
> document in excel??  I need more clarification please. I will appreciate.
> Thanks
> Michel   
> 
> 
> -----Original Message-----
> From: Hui Ouyang [mailto:hui@triplehop.com] 
> Sent: Thursday, April 10, 2003 3:22 PM
> To: Lucene Users List
> Subject: RE: a simple highlight resolvent
> 
> replaceAll() is only available for jdk14 above.
> 
> -----Original Message----- 
> From: keelkerr@hotmail.com [mailto:keelkerr@hotmail.com] 
> Sent: Thu 4/10/2003 12:09 AM 
> To: Lucene Users List 
> Cc: 
> Subject: a simple highlight resolvent
> 
> 
> 
> hi everyone,
> I am new one in lucene and got a lot help in last 2 weeks. Here I
> want to share a a simple highlight resolvent. Of course I have try
> de.iqcomputing.lucene. Here just a simple way base the demo. Here is the
> source. It's a little inept. And I don't don't why it can't compile if I
use
> String.replaceAll(), So I write another method to handle it. Any
suggestion
> are welcomed.
>           Kerr.
> 
> to use it in results.jsp. --------------------------------
> hl = new
> org.apache.lucene.demo.HightLighter(query.toString("contents"));
> ...
>       summary = hl.getHighlightFile(path);
>       if (summary == null || summary.length() < 100 ){
>          summary = doc.get("summary");
> } catch (Exception e){
>         summary = doc.get("summary");
> }
> ...
> ---------------------------------------
> package org.apache.lucene.demo;
> 
> import java.io.*;
> import java.util.*;
> import org.apache.lucene.demo.html.HTMLParser;
> 
> public class HightLighter {
> 
>   String[] keys;
>   boolean[] b;
> 
>   public HightLighter(String query) {
>     HashMap highlight = new HashMap();
>     StringTokenizer st = new StringTokenizer(query, "\"");
>     String aToken;
>     int off = 0;
>     while(st.hasMoreTokens()){
>       aToken = st.nextToken();
>       char aChar;
>       String dest = "";
>       for(int i=0; i<aToken.length(); i++){
>         aChar = aToken.charAt(i);
>         if (aChar != '+' &&
>             aChar != '-' &&
>             aChar != '(' &&
>             aChar != ')' &&
>             aChar != ' '){
>           dest = dest + aChar;
>         }
>       }
>       if (dest.length() > 0) {
>         highlight.put(Integer.toString(off) , dest);
>         off = off + 1;
>       }
>     }
> 
>     keys = new String[highlight.size()];
>     b = new boolean[highlight.size()];
>     for(int c=0; c<highlight.size(); c++){
>       keys[c] = (String)highlight.get(Integer.toString(c));
>       b[c] = false;
>     }
>   }
> 
>   public String replaceKey(String sourceString, String key) {
>     String destString = "";
>     int i = sourceString.indexOf(key);
>     while ( i != -1 ) {
>       destString = destString + sourceString.substring(0,i)
>                  + "<font color=\"red\">" + key + "</font>";
>       sourceString = sourceString.substring(i+key.length());
>       i = sourceString.indexOf(key);
>     }
>     destString = destString+sourceString;
>     return destString;
>   }
> 
>   public String getHighlight(String string){
>     for(int i=0; i<keys.length; i++){
>       if (string.indexOf(keys[i]) != -1){
>         //string = string.replaceAll(keys[i], "<font color=\"red\">"
> + keys[i] + "</font>");
>         string = replaceKey(string, keys[i]);
>       }
>     }
>     return string;
>   }
> 
>   public String getHighlightFile(String path)
>       throws IOException, InterruptedException  {
> 
>     HTMLParser parser = new HTMLParser(new File(path));
>     Reader fr = parser.getReader();
>     String hlString = this.getHighlight(fr);
>     fr.close();
>     return hlString;
>   }
> 
>   public String getHighlight(Reader reader){
> 
>     try{
>       int size = 5 - keys.length;
>       if ( size > 0) {
>         size = size * 100;
>       } else {
>         size = 100;
>       }
>       char[] buffer = new char[size];
>       StringBuffer last = new StringBuffer("..");
>       String temp;
>       boolean end = false;
>       for(int c=0; c<b.length; c++){
>         b[c] = false;
>       }
> 
>       while ((reader.read(buffer) != -1) && !end){
>         temp = new String(buffer);
>         //System.out.println(temp);
>         //temp = temp.replaceAll("\r\n", " ");
>         for(int i=0; i<keys.length; i++){
>           if (!b[i] && temp.indexOf(keys[i]) != -1){
>             b[i] = true;
>             //temp = temp.replaceAll(keys[i], "<font color=\"red\">"
> + keys[i] + "</font>");
>             last.append("." + temp + "..");
>           }
>         }
>         for(int i=0; i<keys.length; i++){
>           if (!b[i]){
>             end = false;
>             break;
>           } else {
>             end = true;
>           }
>         }
>       }
>       last.append(".");
>       if (last.length() < 100){
>         return null;
>       } else {
>         return(this.getHighlight(last.toString()));
>       }
>     } catch (Exception e){
>       return null;
>     }
>   }
> } 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message