lucene-java-user mailing list archives

From Brian Goetz <br...@quiotix.com>
Subject Re: non-ASCII char search problem with nightly build (12 nov.)
Date Mon, 12 Nov 2001 23:43:52 GMT

>The problem probably lies in the QueryParser class, as it takes only the
>least significant byte of each character given in the query.

Are you sure of that?  I recently switched from the standard JavaCC
AsciiCharStream implementation to Doug's FastCharStream implementation,
which should accept two-byte characters.  Are you sure you're using the
current version?

>I had a very similar problem when querying for Polish strings, as they
>contain characters that are encoded as two bytes in UTF-8. Also, the
>characters of the Polish alphabet were not included in the grammar
>definition that the query parser accepted.

There are definitely recent fixes for that -- are you sure you're using the 
current version?
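
To make the reported symptom concrete, here is a minimal sketch of what
"taking only the least significant byte" does to a Polish word. It is not
code from QueryParser, AsciiCharStream, or FastCharStream; the class
LowByteDemo and its methods are hypothetical names invented for this
illustration, and the masking is only an assumption about how an
ASCII-only character stream might mangle its input.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in Lucene or JavaCC.
public class LowByteDemo {

    // Assumed failure mode: keep only the low 8 bits of a 16-bit Java char.
    public static char lowByteOnly(char c) {
        return (char) (c & 0xFF);
    }

    // Apply the truncation to every char of a query string.
    public static String truncate(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append(lowByteOnly(s.charAt(i)));
        }
        return sb.toString();
    }

    // Number of bytes the string occupies when encoded as UTF-8.
    public static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // "zolw" with Polish diacritics, written with escapes so the
        // source file encoding does not matter: U+017C U+00F3 U+0142 'w'.
        String word = "\u017C\u00F3\u0142w";
        String mangled = truncate(word);

        System.out.println("chars in query:   " + word.length());
        System.out.println("UTF-8 bytes:      " + utf8Length(word));
        System.out.println("survives intact:  " + word.equals(mangled));
    }
}
```

In this sketch U+0142 collapses to 'B' (0x42) and U+017C to '|' (0x7C), so
the truncated term can no longer match what was indexed. A stream that
passes full 16-bit chars through unchanged would not have this problem.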



--
Brian Goetz
Quiotix Corporation
brian@quiotix.com           Tel: 650-843-1300            Fax: 650-324-8032

http://www.quiotix.com



