lucene-solr-user mailing list archives

From Walter Underwood <wunderw...@netflix.com>
Subject Re: Query with literal quote character: 6'2"
Date Thu, 07 Feb 2008 20:24:52 GMT
Our users can blow up the parser without special characters.

  AND THE BAND PLAYED ON
  TO HAVE AND HAVE NOT

Lower-casing in the front end avoids that.
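
A minimal sketch of that front-end lower-casing, assuming a Python front end (the thread does not say what the front end is written in). The function name `lowercase_query` is hypothetical. The assumption here is that the parser only treats upper-case words such as AND and NOT as operators, so a lower-cased title reaches it as plain terms.

```python
def lowercase_query(raw_query: str) -> str:
    """Lower-case the user's text before it is sent to the search handler.

    Hypothetical helper: assumes the parser only recognizes operator
    keywords (AND, OR, NOT) when they are written in upper case.
    """
    return raw_query.lower()


if __name__ == "__main__":
    # The two titles from the message above.
    for title in ("AND THE BAND PLAYED ON", "TO HAVE AND HAVE NOT"):
        print(lowercase_query(title))
```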

We have auto-complete on titles, so there are plenty
of chances to inadvertently use special characters:

  Romeo + Juliet
  Airplane! 
  Shrek (Widescreen)

We also have people type "--" for a dash in titles.
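
One possible front-end treatment for titles like these, sketched in Python under stated assumptions: the thread does not describe this approach, and both the function name `strip_syntax_chars` and the character set below are hypothetical. It replaces characters that a query parser might treat as syntax with spaces, then collapses the whitespace.

```python
import re

# Hypothetical set of characters to treat as query syntax.
# This is an assumption for illustration; adjust it to the parser in use.
_SYNTAX_CHARS = re.compile(r'[+\-!(){}\[\]^"~*?:\\]')


def strip_syntax_chars(raw_query: str) -> str:
    """Replace assumed syntax characters with spaces and collapse whitespace."""
    without_syntax = _SYNTAX_CHARS.sub(" ", raw_query)
    return " ".join(without_syntax.split())


if __name__ == "__main__":
    # The three titles from the message above.
    for title in ("Romeo + Juliet", "Airplane!", "Shrek (Widescreen)"):
        print(strip_syntax_chars(title))
```

Note the trade-off: this discards the characters instead of searching for them, so it would not help with a query such as 12:01, where the colon is part of what the user wants to find.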

wunder

On 2/7/08 12:00 PM, "Chris Hostetter" <hossman_lucene@fucit.org> wrote:

> 
> : How about the query parser respecting backslash escaping? I need
> 
> one of the original design decisions was "no user escaping" ... be able
> to take in raw query strings from the user with only '+' '-' and '"'
> treated as special characters ... if you allow backslash escaping of those
> characters, then by definition '\' becomes a special character too.
> 
> : free-text input, no syntax at all. Right now, I'm escaping every
> : Lucene special character in the front end. I just figured out that
> : it breaks for colon, can't search for "12:01" with "12\:01".
> 
> yeah ... your '\' character is being taken literally.  you shouldn't do
> any escaping if you hand off to dismax.
> 
> the right thing to do is probably to expose more of the "query parsing"
> stuff as options for the handler ... let people configure it with what
> characters should be escaped, and what should be left alone.  We should
> also stop using the static utility methods for things like partial
> escaping and unbalanced quote stripping and start using helper methods
> that subclasses can override.
> 
> 
> -Hoss
> 

