lucene-dev mailing list archives

From "Anand Stephen" <an...@sonic.net>
Subject RE: Escaping lucene key words.
Date Tue, 17 Feb 2004 22:46:12 GMT
For example, an end user enters 'Java-test' as the search input.
The Lucene API user then has to escape the '-' before parsing the query;
otherwise the query parser throws the following exception:
<exception>
org.apache.lucene.queryParser.ParseException: Encountered "<EOF>" at line 1, column 8.
Was expecting one of:
    "(" ...
    <QUOTED> ...
    <TERM> ...
    <PREFIXTERM> ...
    <WILDTERM> ...
    <RANGEIN> ...
    <RANGEEX> ...
    <NUMBER> ...
        at org.apache.lucene.queryParser.QueryParser.generateParseException(Unknown Source)
        at org.apache.lucene.queryParser.QueryParser.jj_consume_token(Unknown Source)
        at org.apache.lucene.queryParser.QueryParser.Clause(Unknown Source)
        at org.apache.lucene.queryParser.QueryParser.Query(Unknown Source)
        at org.apache.lucene.queryParser.QueryParser.parse(Unknown Source)
        at org.apache.lucene.queryParser.QueryParser.parse(Unknown Source)
        at
</exception>

To avoid this I escape the input by using this method.

<method>

// Requires: import org.apache.commons.lang.StringUtils;
public static String escape(String query) {
    // The backslash must be handled first; otherwise the backslashes inserted
    // for the other entries would themselves be escaped a second time.
    String[] reservedChars = new String[]{"\\", "AND", "OR", "NOT", "+", "-", "||", "&&",
                                          "!", "(", ")", "{", "}", "^", "~", "*", "?", ":"};

    for (int i = 0, j = reservedChars.length; i < j; i++) {
        final String reservedChar = reservedChars[i];
        if (query.indexOf(reservedChar) >= 0) {
            // Prefix every occurrence of the reserved token with a backslash.
            query = StringUtils.replace(query, reservedChar, "\\" + reservedChar);
        }
    }
//  query = StringUtils.lowerCase(query);

    return query;
}

</method>
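
The same idea can also be sketched as a single pass over the characters of the input, which avoids any dependence on the order of replacements and needs no external library. This is only an illustration under stated assumptions: the class name QueryEscaper is hypothetical, the character set is taken from the list in the method above plus '[', ']' and '"', and the keywords AND, OR and NOT are not handled here.

```java
public class QueryEscaper {

    // Assumed set of special characters: those listed in the method above,
    // plus '[', ']' and '"'. Adjust to match the query parser actually in use.
    private static final String SPECIAL = "\\+-!(){}[]^\"~*?:|&";

    // Returns a copy of the query with a backslash placed before each special character.
    public static String escape(String query) {
        StringBuilder sb = new StringBuilder(query.length() * 2);
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints Java\-test
        System.out.println(escape("Java-test"));
    }
}
```

Because each input character is examined exactly once, a backslash that is added as an escape is never itself escaped again.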

Is this a good idea? If so, would it be helpful to others?

--- Anand Stephen



-----Original Message-----
From: Erik Hatcher [mailto:erik@ehatchersolutions.com] 
Sent: Sunday, February 15, 2004 4:11 AM
To: Lucene Developers List
Subject: Re: Escaping lucene key words.

On Feb 14, 2004, at 10:55 PM, Anand Stephen wrote:
> (i)  Is there a utility method to escape all Lucene keywords?
> Something along the lines of this method:
>
> http://jakarta.apache.org/commons/lang/api/org/apache/commons/lang/StringEscapeUtils.html#escapeXml(java.lang.String)
>
> (ii)  If the answer to (i) is yes, where can I find it?
>
> (iii)  If the answer to (i) is no, would it be a good idea to provide
> Lucene users with an additional utility class (or classes) in the "util"
> package that does some of this work, such as escaping characters, along
> with any other methods that would make the Lucene user's life easier?
> I am willing to contribute code for this method.

Stephen - where are you feeling the need to have this type of escaping  
with Lucene?  Could you show us where there is a problem?

	Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org

