lucene-java-user mailing list archives

From Chris Nokleberg
Subject Re: Avoiding ParseExceptions
Date Wed, 07 Jun 2006 02:13:10 GMT
On Tue, 06 Jun 2006 14:57:06 -0700, Chris Hostetter wrote:
> I took an approach similar to that, by escaping all of the "special"
> characters except '+', '-', and '"', and then stripping out all quotes if
> there was an odd number of them ... this gave me a simplified version of the
> Lucene syntax that was fairly safe.

Actually that is exactly what I wanted (allow '+', '-', and '"' only).

I've attached the code I am using. I'd appreciate it if anyone can find
remaining problems. The escape method should take any string and the
result should be passable to QueryParser without causing a ParseException.

The logic is:
- Lowercase string to remove support for AND and OR (AFAICT they are not
  otherwise escapable)
- Use QueryParser.escape to escape all special characters
- Use a regex to "unescape" only +, -, and '"'
- Count the number of quotes. If it is odd, add a space and a closing
  quote to the end of the query.
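
The steps above can be sketched as a standalone class. This is only an illustration under stated assumptions: escapeSpecials is a hypothetical stand-in for QueryParser.escape so that the sketch runs without Lucene on the classpath, and the set of characters it escapes is an assumption that may differ between Lucene versions. The class name SafeQuerySketch and the UNESCAPE pattern are likewise made up for this example.

```java
import java.util.regex.Pattern;

class SafeQuerySketch {
    // Hypothetical stand-in for QueryParser.escape: prefixes each character
    // assumed to be special with a backslash. The character set is an assumption.
    static String escapeSpecials(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if ("\\+-!():^[]\"{}~*?".indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Matches a backslash followed by '+', '-', or '"' and captures that character.
    static final Pattern UNESCAPE = Pattern.compile("\\\\([+\\-\"])");

    static String escape(String s) {
        s = s.toLowerCase();                        // AND / OR become plain terms
        s = escapeSpecials(s);                      // escape every special character
        s = UNESCAPE.matcher(s).replaceAll("$1");   // re-enable only + - "
        if ((count(s, '"') & 1) != 0) {
            s += " \"";                             // balance an odd number of quotes
        }
        return s;
    }

    static int count(String s, char c) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == c) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(escape("Foo AND (bar"));
        System.out.println(escape("+a -b \"c"));
    }
}
```

With real Lucene, the call to escapeSpecials would be replaced by QueryParser.escape; the rest of the flow is unchanged.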

It would be nice if QueryParser had methods which would let you choose
which of its features you wanted to enable. I understand that due to the
nature of JavaCC that might be hard to do, though.


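  // Reconstructed from the description above: the original pattern did not
  // survive in the archived message. Matches a backslash followed by
  // '+', '-', or '"' and captures that character as group 1.
  private static final Pattern unescape =
      Pattern.compile("\\\\([+\\-\"])");
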
  private static String escape(String s) {
      s = s.toLowerCase();
      s = QueryParser.escape(s);
      s = unescape.matcher(s).replaceAll("$1");
      if ((count(s, '"') & 1) != 0)
          s += " \"";
      return s;
  }

  private static int count(String s, char c) {
      int count = 0;
      for (int i = 0; i < s.length(); i++)
          if (s.charAt(i) == c)
              count++;
      return count;
  }
