lucene-java-user mailing list archives

From Paul Taylor <paul_t...@fastmail.fm>
Subject Query parser fails on Hangul/Korean
Date Sat, 22 Aug 2009 11:17:43 GMT
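import junit.framework.TestCase;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.RAMDirectory;

public class Issue3341Test extends TestCase {

    public void testMatchHangul() throws Exception {
        Analyzer analyzer = new StandardAnalyzer();
        RAMDirectory dir = new RAMDirectory();

        // Index a single document whose "name" field holds the Hangul text.
        IndexWriter writer = new IndexWriter(dir, analyzer, true,
                IndexWriter.MaxFieldLength.LIMITED);
        Document doc = new Document();
        doc.add(new Field("name", "키드갱", Field.Store.YES, Field.Index.ANALYZED));
        writer.addDocument(doc);
        writer.close();

        // Parse the identical text with the same analyzer and search for it.
        IndexSearcher searcher = new IndexSearcher(dir, true);
        Query q = new QueryParser("name", analyzer).parse("키드갱");
        System.out.println(q.toString());

        Hits hits = searcher.search(q);
        assertEquals(1, hits.length());
    }
}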

gives the following:

org.apache.lucene.queryParser.ParseException: Cannot parse '???': '*' or '?' not allowed as first character in WildcardQuery
    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:181)
    at org.musicbrainz.search.analysis.Issue3341Test.testMatchHangul(Issue3341Test.java:32)

Why does the parser think it's a wildcard?
(I'm just using the standard analyser because the search could be
performed in any language, and since the user doesn't specify the
language we don't know which analyser to use. That's okay: I don't
expect Lucene to do anything clever, but I would expect a match when
the indexed text and the query are identical.)
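
One detail in the exception may be worth checking: it reports the query as '???' rather than as the Hangul text, so the string reaching the parser seems to contain literal question marks already. The post does not establish where that happens. The sketch below only illustrates one way it can happen in plain Java: encoding Hangul in a charset that cannot represent it substitutes '?' for each character. The class name HangulEncodingCheck and the helper roundTrip are hypothetical, and the literal is written with Unicode escapes so that it does not depend on the encoding the compiler assumes for the source file.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class HangulEncodingCheck {

    // The three Hangul syllables from the test, written as Unicode escapes
    // so the literal is independent of the source file encoding.
    static final String HANGUL = "\uD0A4\uB4DC\uAC31";

    // Encode the text to bytes in the given charset, then decode it again.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // The platform default is what tools fall back to when no encoding is given.
        System.out.println("default charset: " + Charset.defaultCharset());

        // ISO-8859-1 has no Hangul, so each syllable becomes '?'.
        System.out.println("ISO-8859-1 gives ???: "
                + roundTrip(HANGUL, StandardCharsets.ISO_8859_1).equals("???"));

        // UTF-8 can represent the text, so it survives unchanged.
        System.out.println("UTF-8 preserves text: "
                + roundTrip(HANGUL, StandardCharsets.UTF_8).equals(HANGUL));
    }
}
```

If the same escapes were used in the test in place of the literal "키드갱" and the exception went away, that would point to the encoding of the source file or build rather than to the query parser. This is an assumption to verify, not a confirmed diagnosis.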


Thanks, Paul
