lucene-java-user mailing list archives

From Brian Goetz <>
Subject Re: non-ASCII char search problem with nightly build (12 nov.)
Date Mon, 12 Nov 2001 23:43:52 GMT

>The problem probably lies in the QueryParser class, as it takes only the 
>least significant byte of each character given in the query.

Are you sure of that?  I recently switched from the standard JavaCC 
AsciiCharStream implementation to Doug's FastCharStream implementation, 
which should accept two-byte characters.  Are you sure you're using the 
current version?
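
For illustration only, here is a minimal sketch of the symptom described in 
the quoted text: what happens to a query string if only the low 8 bits of 
each 16-bit Java char survive. The class LowByteDemo and the helper 
truncateToLowByte are hypothetical names made up for this example; this is 
not the actual AsciiCharStream or FastCharStream code.

```java
public class LowByteDemo {
    // Hypothetical helper: mimics a reader that keeps only the low 8 bits
    // of each char. This is an assumption about the failure mode, not the
    // real Lucene or JavaCC implementation.
    static String truncateToLowByte(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append((char) (s.charAt(i) & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The Polish word for "turtle", written with escapes so the source
        // file encoding does not matter.
        String query = "\u017C\u00F3\u0142w";
        String mangled = truncateToLowByte(query);

        // U+017C and U+0142 lose their high byte; U+00F3 and 'w' fit in
        // 8 bits and pass through unchanged.
        System.out.println("unchanged: " + query.equals(mangled));
        for (int i = 0; i < query.length(); i++) {
            System.out.printf("U+%04X -> U+%04X%n",
                    (int) query.charAt(i), (int) mangled.charAt(i));
        }
    }
}
```

A term mangled this way would no longer match what was indexed, which is 
consistent with the reported search failures. A stream that passes the full 
char through would not have this problem.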

>I had a very similar problem when querying for Polish strings, as they 
>contain characters that are encoded as two bytes in UTF-8. Also, the 
>characters of the Polish alphabet were not included in the grammar 
>definition that the query parser accepted.

There are definitely recent fixes for that -- are you sure you're using the 
current version?
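
As a quick check of the two-byte point in the quoted text, the sketch below 
compares the number of Java chars with the number of UTF-8 bytes for a few 
letters. The class Utf8LengthDemo and the helper utf8Length are hypothetical 
names for this example, and StandardCharsets assumes a modern JDK (Java 7 or 
later), not the JDK current when this message was written.

```java
import java.nio.charset.StandardCharsets;

public class Utf8LengthDemo {
    // Hypothetical helper: number of bytes the string occupies in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // 'a', then the Polish letters U+0142 and U+017C, written as
        // escapes so the source file encoding does not matter.
        String[] samples = { "a", "\u0142", "\u017C" };
        for (String s : samples) {
            System.out.printf("U+%04X: %d char(s), %d UTF-8 byte(s)%n",
                    (int) s.charAt(0), s.length(), utf8Length(s));
        }
    }
}
```

Each of these Polish letters is a single Java char but two bytes in UTF-8, 
so code that treats one byte as one character will split or truncate them.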

Brian Goetz
Quiotix Corporation           Tel: 650-843-1300            Fax: 650-324-8032
