lucene-java-user mailing list archives

From Jeff Davis <jada...@gmail.com>
Subject Re: QueryParser handling of backslash characters
Date Wed, 20 Jul 2005 21:18:00 GMT
That fix works perfectly, as far as I can tell.

As for the unit test, it should actually be:
assertEquals("\\\\192.168.0.15\\public", discardEscapeChar
("\\\\\\\\192.168.0.15\\\\public"));
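
To make the difference concrete, here is a minimal, self-contained sketch (not the actual QueryParser source) that puts the original method quoted below next to Eyal's version. The class name DiscardEscapeCharDemo, the method name discardEscapeCharOld and the escapeSketch helper are hypothetical names made up for this example. escapeSketch is only an assumed stand-in for QueryParser's escape method and covers just the special characters that appear in Eyal's test string.

```java
public class DiscardEscapeCharDemo {

    // Original logic as quoted in this thread: a backslash is dropped
    // unless the character before it is also a backslash.
    static String discardEscapeCharOld(String input) {
        char[] caSource = input.toCharArray();
        char[] caDest = new char[caSource.length];
        int j = 0;
        for (int i = 0; i < caSource.length; i++) {
            if ((caSource[i] != '\\') || (i > 0 && caSource[i - 1] == '\\')) {
                caDest[j++] = caSource[i];
            }
        }
        return new String(caDest, 0, j);
    }

    // Eyal's version: a backslash consumes exactly one following character,
    // which is copied literally; a trailing lone backslash is dropped.
    static String discardEscapeChar(String input) {
        char[] caSource = input.toCharArray();
        char[] caDest = new char[caSource.length];
        int j = 0;
        for (int i = 0; i < caSource.length; i++) {
            if (caSource[i] == '\\') {
                if (caSource.length == ++i) {
                    break;
                }
            }
            caDest[j++] = caSource[i];
        }
        return new String(caDest, 0, j);
    }

    // Hypothetical escape helper (an assumption, not Lucene's code):
    // prefixes each special character with a backslash.
    static String escapeSketch(String s) {
        String special = "\\+-!():^[]{}~*?";
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (special.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Query text with three leading backslashes, as in the original report.
        String three = "\\\\\\192.168.0.15\\\\public";
        // Fully escaped query text: four leading backslashes.
        String four = "\\\\\\\\192.168.0.15\\\\public";

        System.out.println(discardEscapeCharOld(three)); // \\192.168.0.15\public
        System.out.println(discardEscapeCharOld(four));  // \\\192.168.0.15\public
        System.out.println(discardEscapeChar(four));     // \\192.168.0.15\public

        // Round trip over the special characters from Eyal's suggested test.
        String s = "\\\\some.host.name\\dir+:+-!():^[]{}~*?";
        System.out.println(s.equals(discardEscapeChar(escapeSketch(s)))); // true
    }
}
```

The first line of output shows why the three-backslash query happened to match with the old code, and the second shows the old code keeping one backslash too many for the properly escaped input.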

Jeff


On 7/20/05, Eyal <eyal.junk@gmail.com> wrote:
> I think this should work:
> 
> (Written in C# originally - so someone please check if it compiles - I don't
> have a java compiler here)
> 
>     private String discardEscapeChar(String input)
>     {
>       char[] caSource = input.toCharArray();
>       char[] caDest = new char[caSource.length];
>       int j = 0;
> 
>       for (int i = 0; i < caSource.length; i++)
>       {
>         if (caSource[i] == '\\')
>         {
>           if (caSource.length == ++i)
>             break;
>         }
>         caDest[j++]=caSource[i];
>       }
>       return new String(caDest, 0, j);
>     }
> 
> 
> Regarding your UnitTest - I think it's wrong:
> 
> >      assertEquals("\\\\\\\\192.168.0.15\\\\public",
> > discardEscapeChar ("\\\\192.168.0.15\\\\public"));
> 
> It should be: assertEquals("\\\\192.168.0.15\\\\public", discardEscapeChar
> ("\\\\\\\\192.168.0.15\\\\public"));
> 
> I would also suggest to add the following:
> String s="\\\\some.host.name\\dir+:+-!():^[]{}~*?";
> assertEquals(s,discardEscapeChar(escape(s)));
> 
> Eyal
> 
> > -----Original Message-----
> > From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
> > Sent: Wednesday, July 20, 2005 22:38 PM
> > To: java-user@lucene.apache.org
> > Subject: Re: QueryParser handling of backslash characters
> >
> >
> > On Jul 19, 2005, at 11:19 AM, Jeff Davis wrote:
> >
> > > Hi,
> > >
> > > > I'm seeing some strange behavior in the way the QueryParser handles
> > > > consecutive backslash characters.  I know that backslash is the escape
> > > > character in Lucene, and so I would expect "\\\\" to match fields that
> > > > have two consecutive backslashes, but this does not seem to be the
> > > > case.
> > > >
> > > > The fields I'm searching are UNC paths, e.g. "\\192.168.0.15\public".
> > > > The only way I can get my query to find the record containing that
> > > > value is to type "FieldName:\\\192.168.0.15\\public" (three slashes).
> > > > Why is the third backslash character not treated as an escape?  Is it
> > > > just that any backslash that is preceded by a backslash is interpreted
> > > > as a literal backslash character, regardless of whether the "escape"
> > > > backslash was itself escaped?
> > > >
> > > > I can code around this, but it seems inconsistent with the way that
> > > > escape characters usually work.  Is this a bug, or is it intentional,
> > > > or am I missing something?
> >
> > I've waited until I had a chance to experiment with this
> > before replying.  I say that this is a bug.  There is a
> > private method in QueryParser called discardEscapeChar (shown
> > below).  I copied it to a JUnit test case and gave it this assert:
> >
> >      assertEquals("\\\\\\\\192.168.0.15\\\\public",
> > discardEscapeChar ("\\\\192.168.0.15\\\\public"));
> >
> > This test fails with:
> >
> >      Expected:\\\\192.168.0.15\\public
> >      Actual  :\192.168.0.15\public
> >
> > Which is wrong in my opinion.  (though my head hurts thinking
> > about metaescaping backslashes in Java code to make this a
> > proper test)
> >
> > The bug is isolated to the discardEscapeChar() method where
> > it eats too many backslashes.  Could you have a shot at
> > tweaking that method to do the right thing and submit a patch?
> >
> >    private String discardEscapeChar(String input) {
> >      char[] caSource = input.toCharArray();
> >      char[] caDest = new char[caSource.length];
> >      int j = 0;
> >      for (int i = 0; i < caSource.length; i++) {
> >        if ((caSource[i] != '\\') || (i > 0 && caSource[i-1] == '\\')) {
> >          caDest[j++]=caSource[i];
> >        }
> >      }
> >      return new String(caDest, 0, j);
> >    }
> >
> > Erik
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> >