From: Walter Underwood
Date: Thu, 07 Feb 2008 12:24:52 -0800
Subject: Re: Query with literal quote character: 6'2"
Reply-To: solr-user@lucene.apache.org

Our users can blow up the parser without special characters.

  AND THE BAND PLAYED ON
  TO HAVE AND HAVE NOT

Lower-casing in the front end avoids that.

We have auto-complete on titles, so there are plenty of chances
to inadvertently use special characters:

  Romeo + Juliet
  Airplane!
  Shrek (Widescreen)

We also have people type "--" for a dash in titles.

wunder

On 2/7/08 12:00 PM, "Chris Hostetter" wrote:

>
> : How about the query parser respecting backslash escaping? I need
>
> one of the original design decisions was "no user escaping" ... be able
> to take in raw query strings from the user with only '+' '-' and '"'
> treated as special characters ... if you allow backslash escaping of those
> characters, then by definition '\' becomes a special character too.
>
> : free-text input, no syntax at all. Right now, I'm escaping every
> : Lucene special character in the front end. I just figured out that
> : it breaks for colon, can't search for "12:01" with "12\:01".
>
> yeah ... your '\' character is being taken literally. you shouldn't do
> any escaping if you hand off to dismax.
>
> the right thing to do is probably to expose more of the "query parsing" stuff
> as options for the handler ... let people configure it with what
> characters should be escaped, and what should be left alone. We should
> also stop using the static utility methods for things like partial
> escaping and unbalanced quote stripping and start using helper methods
> that subclasses can override.
>
>
> -Hoss
>
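
For illustration, the front-end clean-up discussed in this thread (lower-casing so that AND and NOT in a title are not read as operators, plus unbalanced-quote stripping) might be sketched roughly as follows. This is Python for illustration only, not Solr code; the function names are hypothetical, and removing every double quote when the count is odd is an assumed policy rather than a description of what Solr's own utility does.

```python
def lowercase_operators(query: str) -> str:
    """Lower-case the whole query so AND, OR, NOT read as plain words."""
    return query.lower()


def strip_unbalanced_quotes(query: str) -> str:
    """Drop every double quote if the query contains an odd number of them.

    Assumed policy: an odd count means at least one quote is unmatched,
    so no phrase boundaries can be trusted.
    """
    if query.count('"') % 2 == 1:
        return query.replace('"', "")
    return query


def clean_user_query(query: str) -> str:
    """Hypothetical front-end step applied before the query is sent on."""
    return strip_unbalanced_quotes(lowercase_operators(query))


if __name__ == "__main__":
    for title in ["TO HAVE AND HAVE NOT", "6'2\"", '"Romeo + Juliet"']:
        print(repr(clean_user_query(title)))
```

Note that this sketch leaves '+' and '-' alone, so a title such as Romeo + Juliet would still need separate handling if those characters are treated as operators.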