lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: UTF-8/unicode input in querying in Lucene
Date Sat, 15 Sep 2007 00:47:51 GMT

: The page http://lucene.apache.org/java/docs/queryparsersyntax.html does not
: mention that \uNNNN Unicode syntax is supported.
: For example, \u0048\u0045\u004c\u004c\u004f is HELLO.
:  
: Please add this to the page, it took experimentation to discover it.

I don't believe the QueryParser actually treats \uNNNN as a special 
syntax ... what you may have encountered is that when *javac* parses a 
string literal, those sequences have special meaning -- so they 
are already the literal Unicode characters long before QueryParser sees 
them.
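
A small sketch of what I mean, using nothing but the JDK (the class and 
method names here are made up for illustration). The first literal is 
converted by the compiler, so any code that receives it only ever sees 
HELLO; doubling the backslash is what it takes to get the raw six 
characters into a String.

```java
public class LiteralEscapeDemo {
    // javac converts the five escapes below while reading the source file,
    // so the compiled constant is simply the letters H, E, L, L, O.
    static String compiledLiteral() {
        return "\u0048\u0045\u004c\u004c\u004f";
    }

    // With a doubled backslash the compiler leaves the text alone: the
    // resulting String holds a backslash, a 'u', and four digits.
    static String rawText() {
        return "\\u0048";
    }

    public static void main(String[] args) {
        System.out.println(compiledLiteral().equals("HELLO")); // true
        System.out.println(compiledLiteral().length());        // 5
        System.out.println(rawText().length());                // 6
    }
}
```

If the query string comes from a file, a form field, or the command line 
rather than from a literal in Java source, javac never gets the chance to 
do this conversion.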

As far as the query parser is concerned, the backslash in \uNNNN is only 
escaping the "u" (any character can be escaped, whether it needs it or 
not).
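
To illustrate that rule, here is a toy model of "backslash escapes the 
next character". This is NOT Lucene's code; dropEscapes is a hypothetical 
helper written only to show the behavior described above, assuming the 
parser does nothing more with an escape than keep the character that 
follows it.

```java
public class EscapeModelDemo {
    // Hypothetical, simplified model of the escaping rule: a backslash is
    // dropped and the character after it is kept literally. A trailing
    // lone backslash is simply discarded in this sketch.
    static String dropEscapes(String input) {
        StringBuilder out = new StringBuilder();
        boolean escaped = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (!escaped && c == '\\') {
                escaped = true;      // skip the backslash itself
            } else {
                out.append(c);       // ordinary or escaped character
                escaped = false;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Raw query text: a backslash, 'u', and four digits.
        System.out.println(dropEscapes("\\u0048"));  // u0048
        // Escaping a character that needs it works the same way.
        System.out.println(dropEscapes("a\\:b"));    // a:b
    }
}
```

Under this model the raw text comes out as "u0048", not "H", which is why 
I suspect the conversion you saw happened in javac and not in the parser.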



-Hoss

