lucene-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 6091] - QueryParser not recognizing asterisk with UTF-8 index
Date Mon, 22 Dec 2003 09:35:38 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=6091>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=6091

QueryParser not recognizing asterisk with UTF-8 index

------- Additional Comments From tero@favorin.com  2003-12-22 09:35 -------
The following test program prints: 
Found id: 
1 
Found id: 
1 
Found id: 
 
The last search should find the document with id 1 as well. Tested with lucene-1.3-rc3. 
 
----------------------------- 
import org.apache.lucene.analysis.*; 
import org.apache.lucene.index.*; 
import org.apache.lucene.document.*; 
import org.apache.lucene.search.*; 
import org.apache.lucene.queryParser.*; 
import java.io.*; 
/** 
 * Self contained test for Lucene indexes.  
 */ 
public class LuceneTest { 
   
  public static void main(String args[]) { 
    String outdirname="/tmp/testidx"; // Index directory 
    try { 
      // Creates index directory, if necessary. 
      File outdir=new File(outdirname); 
      if (!outdir.exists()) 
        outdir.mkdir(); 
      // Create an index with a single document. 
      Analyzer analyzer=new SimpleAnalyzer(); 
      IndexWriter writer = new IndexWriter(outdirname,analyzer,true); 
      addDoc(writer,1, "för"); // The second letter is o with two dots. 
      writer.optimize(); 
      writer.close(); 
      // Search the index. 
      Searcher searcher=new IndexSearcher(outdirname); 
      searchDoc(analyzer,searcher,"för"); // Ok 
      searchDoc(analyzer,searcher,"f*"); // Ok 
      searchDoc(analyzer,searcher,"fö*"); // Wrong! Does not find anything.  
    } catch (Exception e) { 
      e.printStackTrace(); 
      return; 
    } 
  } 
  /** 
   * Add a document to index. 
   * The text is encoded to UTF-8 bytes, then decoded with the platform default charset. 
   */ 
  private static void addDoc(IndexWriter writer,int id,String text) throws Exception { 
    Document doc=new Document(); 
    doc.add(new Field("id",Long.toString(id),true,false,false)); 
    doc.add(new Field("text",new String(text.getBytes("UTF-8")),false,true,true)); 
    writer.addDocument(doc); 
  } 
  /** 
   * Search the index. 
   * The text is encoded to UTF-8 bytes, then decoded with the platform default charset. 
   */ 
  private static void searchDoc(Analyzer analyzer, Searcher searcher, String text) throws Exception { 
    Query q=QueryParser.parse(new String(text.getBytes("UTF-8")),"text",analyzer); 
    Hits hits=searcher.search(q); 
    System.out.println("Found id:"); 
    for (int i=0;i<hits.length();i++) 
      System.out.println(hits.doc(i).get("id")); 
  }   
}
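
A possible explanation, offered as a sketch and not as a confirmed diagnosis: new String(text.getBytes("UTF-8")) decodes the UTF-8 bytes with the platform default charset. If that default is a single-byte charset such as ISO-8859-1 (an assumption about the environment the test ran in), the two bytes of "ö" become two characters, one of which is not a letter. The class name RoundTripDemo and the helper letterTokens below are hypothetical. letterTokens only approximates a letter-based, lowercasing analyzer; it is not Lucene's SimpleAnalyzer.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class RoundTripDemo {

    // Mirrors new String(text.getBytes("UTF-8")) from the test program, but with
    // the decoding charset passed explicitly instead of taken from the platform default.
    public static String roundTrip(String text, Charset decodeWith) {
        return new String(text.getBytes(StandardCharsets.UTF_8), decodeWith);
    }

    // Hypothetical stand-in for a letter-based analyzer:
    // split on non-letter characters and lowercase each token.
    public static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String term = "f\u00f6r"; // "för", written with an escape to avoid source-encoding issues

        // Assumed single-byte default charset: the 4 UTF-8 bytes become 4 characters.
        String mangled = roundTrip(term, StandardCharsets.ISO_8859_1);
        System.out.println("length after ISO-8859-1 decode: " + mangled.length());
        System.out.println("tokens after ISO-8859-1 decode: " + letterTokens(mangled));

        // With a UTF-8 default the round trip leaves the text unchanged.
        String intact = roundTrip(term, StandardCharsets.UTF_8);
        System.out.println("unchanged after UTF-8 decode: " + intact.equals(term));
    }
}
```

Under these assumptions the indexed text splits into the two tokens "fã" and "r". That would be consistent with the output above: "f*" matches a term, the full word matches because it is mangled and tokenized the same way at query time, and a prefix that still contains the non-letter character matches no indexed term. Passing the String unchanged to Field and QueryParser.parse would avoid the round trip; whether that resolves this bug is not confirmed here.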

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org

