lucene-dev mailing list archives

From "Michael Busch (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-881) QueryParser escaping/parsing issue with strings starting/ending with ||
Date Thu, 17 May 2007 17:18:17 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496634 ]

Michael Busch commented on LUCENE-881:
--------------------------------------

You are right, Yonik: || is reserved.

The QueryParser itself works correctly:

"|| test ||" yields a ParseException, which is correct because in this case || means OR.
"\|\| test \|\|" yields "|| test ||", which is also correct because both | characters are escaped.


The problem here is the escape() method:

  /**
   * Returns a String where those characters that QueryParser
   * expects to be escaped are escaped by a preceding <code>\</code>.
   */
  public static String escape(String s);

It escapes characters such as +, -, ! and so on. Example:

escape("++ test ++") yields "\+\+ test \+\+"

but

escape("|| test ||") yields "|| test ||".

I believe that, to be consistent, escape() should also escape the two characters | and &, no?
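
To make the proposal concrete, here is a minimal, standalone sketch of an escape method that treats | and & like the other special characters. It is not the Lucene source: the class name EscapeSketch and the exact character set in SPECIAL are assumptions for illustration, modelled on the special characters of the query syntax.

```java
public class EscapeSketch {

    // Assumed set of characters that the query syntax treats as special,
    // extended with '|' and '&' as proposed above. Not taken from Lucene.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    /** Returns s with every character in SPECIAL preceded by a backslash. */
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() * 2);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("++ test ++")); // prints \+\+ test \+\+
        System.out.println(escape("|| test ||")); // prints \|\| test \|\|
    }
}
```

Escaping every single | and & is simpler than detecting only the doubled forms || and &&, and it is harmless as long as the parser accepts a backslash before any character.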

> QueryParser escaping/parsing issue with strings starting/ending with ||
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-881
>                 URL: https://issues.apache.org/jira/browse/LUCENE-881
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: QueryParser
>    Affects Versions: 2.1, 2.2
>         Environment: MAC OS X 10.4.7, J2SE 5.0 Release 4
>            Reporter: Slobodan Marjanovic
>         Assigned To: Michael Busch
>            Priority: Trivial
>
> There is a problem with the query parser when the search string starts or ends with ||. When the string contains || in the middle, as in 'something || something', everything runs without a problem.
> Part of code: 
>   searchText = QueryParser.escape(searchText);
>   QueryParser parser = null;
>   parser = new QueryParser(fieldName, new CustomAnalyser());
>   parser.parse(searchText);
> The CustomAnalyser class extends Analyzer. Here is the only overridden method:
>     @Override
>     public TokenStream tokenStream(String fieldName, Reader reader) {
>       return new PorterStemFilter( (new StopAnalyzer()).tokenStream(fieldName, reader));
>     }
> I have tested this on Lucene 2.1 and on the latest source checked out from SVN (Revision 538867), and in both cases a parsing exception was thrown.
> Part of Stack Trace (Lucene - SVN checkout - Revision 538867):
> Cannot parse 'someting ||': Encountered "<EOF>" at line 1, column 11.
> Was expecting one of:
>     <NOT> ...
>     "+" ...
>     "-" ...
>     "(" ...
>     "*" ...
>     <QUOTED> ...
>     <TERM> ...
>     <PREFIXTERM> ...
>     <WILDTERM> ...
>     "[" ...
>     "{" ...
>     <NUMBER> ...
>     
>  org.apache.lucene.queryParser.ParseException: Cannot parse 'someting ||': Encountered "<EOF>" at line 1, column 11.
> Was expecting one of:
>     <NOT> ...
>     "+" ...
>     "-" ...
>     "(" ...
>     "*" ...
>     <QUOTED> ...
>     <TERM> ...
>     <PREFIXTERM> ...
>     <WILDTERM> ...
>     "[" ...
>     "{" ...
>     <NUMBER> ...
>     
>         at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:150)
> Part of Stack Trace (Lucene 2.1):
> Cannot parse 'something ||': Encountered "<EOF>" at line 1, column 12.
> Was expecting one of:
>     <NOT> ...
>     "+" ...
>     "-" ...
>     "(" ...
>     "*" ...
>     <QUOTED> ...
>     <TERM> ...
>     <PREFIXTERM> ...
>     <WILDTERM> ...
>     "[" ...
>     "{" ...
>     <NUMBER> ...
>     
>  org.apache.lucene.queryParser.ParseException: Cannot parse 'something ||': Encountered "<EOF>" at line 1, column 12.
> Was expecting one of:
>     <NOT> ...
>     "+" ...
>     "-" ...
>     "(" ...
>     "*" ...
>     <QUOTED> ...
>     <TERM> ...
>     <PREFIXTERM> ...
>     <WILDTERM> ...
>     "[" ...
>     "{" ...
>     <NUMBER> ...
>     
>         at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:149)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

