lucene-java-user mailing list archives

From Erik Hatcher
Subject Re: QueryParser handling of backslash characters
Date Wed, 20 Jul 2005 19:37:45 GMT

On Jul 19, 2005, at 11:19 AM, Jeff Davis wrote:

> Hi,
> I'm seeing some strange behavior in the way the QueryParser handles
> consecutive backslash characters.  I know that backslash is the escape
> character in Lucene, and so I would expect "\\\\" to match fields that
> have two consecutive backslashes, but this does not seem to be the
> case.
> The fields I'm searching are UNC paths, e.g. "\\public".
> The only way I can get my query to find the record containing that
> value is to type "FieldName:\\\public" (three slashes).
> Why is the third backslash character not treated as an escape?  Is it
> just that any backslash that is preceded by a backslash is interpreted
> as a literal backslash character, regardless of whether the "escape"
> backslash was itself escaped?
> I can code around this, but it seems inconsistent with the way that
> escape characters usually work.  Is this a bug, or is it intentional,
> or am I missing something?

I've waited until I had a chance to experiment with this before  
replying.  I say that this is a bug.  There is a private method in  
QueryParser called discardEscapeChar (shown below).  I copied it to a  
JUnit test case and gave it this assert:

     assertEquals("\\\\\\\\\\\\public", discardEscapeChar 

This test fails with:

     Actual  :\\public

Which is wrong in my opinion.  (though my head hurts thinking about  
metaescaping backslashes in Java code to make this a proper test)
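
To keep the escaping levels apart: a Java string literal needs two backslashes in source for every backslash that ends up in the string, and the query syntax then treats a backslash as its own escape character on top of that. The small sketch below only counts what a literal really contains; the class and method names are made up for illustration and are not part of Lucene.

```java
class LiteralEscapes {
    // Counts the backslash characters actually stored at the start of s,
    // as opposed to the number typed in the Java source literal.
    static int leadingBackslashes(String s) {
        int n = 0;
        while (n < s.length() && s.charAt(n) == '\\') {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Six backslashes in source become three characters in the string.
        String typed = "\\\\\\public";
        System.out.println(typed + " starts with "
                + leadingBackslashes(typed) + " backslashes");
    }
}
```

So any test string for the query parser has to be written with twice as many backslashes in Java source as the parser will actually see.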

The bug is isolated to the discardEscapeChar() method where it eats  
too many backslashes.  Could you have a shot at tweaking that method  
to do the right thing and submit a patch?

   private String discardEscapeChar(String input) {
     char[] caSource = input.toCharArray();
     char[] caDest = new char[caSource.length];
     int j = 0;
     for (int i = 0; i < caSource.length; i++) {
       if ((caSource[i] != '\\') || (i > 0 && caSource[i-1] == '\\')) {
         caDest[j++] = caSource[i];
       }
     }
     return new String(caDest, 0, j);
   }
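
As a starting point for such a patch, here is a minimal sketch and not the actual Lucene fix. `quotedRule` restates the condition from the method above, assuming the loop body simply copies the character; `pairwiseUnescape` is a hypothetical alternative that remembers whether the previous backslash was already consumed as an escape, so that each pair of backslashes collapses to one. Both names, and the class name, are invented for illustration.

```java
class EscapeSketch {
    // The condition as quoted: keep a character unless it is a backslash
    // that is not preceded by another backslash.
    // Assumption: the elided loop body just copies the character.
    static String quotedRule(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c != '\\' || (i > 0 && input.charAt(i - 1) == '\\')) {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Hypothetical alternative: a backslash escapes exactly the next
    // character, including another backslash.
    static String pairwiseUnescape(String input) {
        StringBuilder out = new StringBuilder();
        boolean escaped = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\\' && !escaped) {
                escaped = true;    // drop the escape character itself
            } else {
                out.append(c);     // ordinary or escaped character
                escaped = false;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Four backslashes followed by "public" (eight in Java source).
        String query = "\\\\\\\\public";
        System.out.println("quoted rule: " + quotedRule(query));
        System.out.println("pairwise:    " + pairwiseUnescape(query));
    }
}
```

Under the quoted condition, only the first backslash of a run is dropped, whereas the pairwise version halves the run. The sketch silently drops a trailing lone backslash; a real patch would need to decide whether that case should be an error.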

