jackrabbit-users mailing list archives

From Marcel Reutegger <marcel.reuteg...@gmx.net>
Subject Re: AW: FullText Search Problem
Date Fri, 30 Nov 2007 10:10:31 GMT
KÖLL Claus wrote:
> so either i will filter some characters from the search string or
> jackrabbit should handle it.
> i think the second one will be better

JSR 170 specifies a set of characters that must be escaped if you want to 
use them as literals rather than with the semantics the spec gives them:

"Within the searchexp literal instances of single quote (“'”), double quote 
(“"”) and hyphen (“-”) must be escaped with a backslash (“\”). Backslash itself 
must therefore also be escaped, ending up as double backslash (“\\”)."

Jackrabbit extended this set to provide additional functionality. For example, 
you can do a fuzzy search: test~

This, however, means that you need to escape more than the specified set of 
characters. Strictly speaking, this is a violation of the spec. But without 
extending this set of characters, additional functionality is very difficult to 
implement.

The current set of special characters that need escaping is:

"\\", "+", "-", "!", "(", ")", ":", "^", "[", "]", "\"", "{", "}", "~", "*", "?"
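
Until that set changes, a client can escape user input itself before building 
the search expression. The following is only a rough sketch: the class name 
SearchExpEscaper and the method escape are made up for illustration and are not 
part of Jackrabbit or JSR 170, and the character set is simply copied from the 
list above.

```java
// Hypothetical helper, not a Jackrabbit API.
final class SearchExpEscaper {

    // The 16 special characters from the list above, as one Java string:
    // \ + - ! ( ) : ^ [ ] " { } ~ * ?
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?";

    private SearchExpEscaper() {
    }

    // Returns the term with a backslash placed in front of every special character.
    static String escape(String term) {
        StringBuilder sb = new StringBuilder(term.length() + 8);
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

With this, SearchExpEscaper.escape("test~") gives test\~ so the tilde is treated 
as a literal and no longer triggers a fuzzy search. Note that the result is the 
escaped search expression only; embedding it in a query string literal may need 
further quoting on top of this.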

What I propose is to limit the set to only those characters that are really 
required, and then to document it clearly. For example, "!" is equivalent to 
"-" and the keyword NOT.

regards
  marcel
