Date: Mon, 25 Feb 2013 22:18:13 +0000 (UTC)
From: Jan Høydahl (JIRA)
To: dev@lucene.apache.org
Subject: [jira] [Commented] (SOLR-4480) EDisMax parser blows up with query containing single plus or minus

    [ https://issues.apache.org/jira/browse/SOLR-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586368#comment-13586368 ]

Jan Høydahl commented on SOLR-4480:
-----------------------------------

So let's take the String field example. A single %2B crashes the Lucene query parser, and since we just pass it straight through it crashes eDisMax too.
For the Lucene parser, it crashes for all query strings *ending* in a single "+"
http://localhost:8983/solr/select?debug=query&q=foo%20%2B
but not for queries where there is whitespace after the "+"
http://localhost:8983/solr/select?debug=query&q=%2B%20foo

eDismax is a bit different. It does not crash on a trailing "+" but it swallows it:
http://localhost:8983/solr/select?debug=query&defType=edismax&df=foo_s&q=%2B%20hello%20%2B

This is due to lines 700-703 being too quick at guessing that the + or - means MUST or NOT
{code}
if (ch=='+' || ch=='-') {
  clause.must = ch;
  pos++;
}
{code}

I'm ok with saying that a single "+" or "-" should mean literal matching (given that the field type supports it), and thus we translate '+'->'\+'. But then we should do the same for the "+" or "-" at the end of a query string.

> EDisMax parser blows up with query containing single plus or minus
> ------------------------------------------------------------------
>
>                 Key: SOLR-4480
>                 URL: https://issues.apache.org/jira/browse/SOLR-4480
>             Project: Solr
>          Issue Type: Bug
>          Components: query parsers
>            Reporter: Fiona Tay
>            Priority: Critical
>             Fix For: 4.2, 5.0
>
>         Attachments: SOLR-4480.patch, SOLR-4480.patch
>
>
> We are running solr with sunspot and when we set up a query containing a single plus, Solr blows up with the following error:
> SOLR Request (5.0ms)  [ path=# parameters={data: fq=type%3A%28Attachment+OR+User+OR+GpdbDataSource+OR+HadoopInstance+OR+GnipInstance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29&fq=type_name_s%3A%28Attachment+OR+User+OR+Instance+OR+Workspace+OR+Workfile+OR+Tag+OR+Dataset+OR+HdfsEntry%29&fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29&fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29&fq=-%28security_type_name_sm%3A%28Dataset%29+AND+-instance_account_ids_im%3A%282+OR+1%29%29&fq=-%28security_type_name_sm%3AChorusView+AND+-member_ids_im%3A1+AND+-public_b%3Atrue%29&q=%2B&fl=%2A+score&qf=name_texts+first_name_texts+last_name_texts+file_name_texts&defType=edismax&hl=on&hl.simple.pre=%40%40%40hl%40%40%40&hl.simple.post=%40%40%40endhl%40%40%40&start=0&rows=3, method: post, params: {:wt=>:ruby}, query: wt=ruby, headers: {"Content-Type"=>"application/x-www-form-urlencoded; charset=UTF-8"}, path: select, uri: http://localhost:8982/solr/select?wt=ruby, open_timeout: , read_timeout: } ]
> RSolr::Error::Http (RSolr::Error::Http - 400 Bad Request
> Error:     org.apache.lucene.queryParser.ParseException: Cannot parse '': Encountered "" at line 1, column 0.
> Was expecting one of:
>     ...
>     "+" ...
>     "-" ...
>     "(" ...
>     "*" ...
>     ...
>     ...
>     ...
>     ...
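
The proposal in the comment above (treat a free-standing "+" or "-" as a literal by translating '+'->'\+', including at the end of the query string) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, it is not the attached SOLR-4480.patch and not part of Solr's API, and it assumes a "single" operator is one with whitespace or a string boundary on both sides. It does not try to handle operators inside quoted phrases.

```java
// Hypothetical helper, not Solr code: escapes a "+" or "-" that stands alone
// as a token so a downstream parser can treat it as a literal character.
final class StandaloneOperatorEscaper {
    private StandaloneOperatorEscaper() {}

    /** Returns the query with every free-standing '+' or '-' rewritten as '\+' or '\-'. */
    static String escapeStandaloneOperators(String query) {
        StringBuilder out = new StringBuilder(query.length() + 4);
        for (int pos = 0; pos < query.length(); pos++) {
            char ch = query.charAt(pos);
            if (ch == '+' || ch == '-') {
                // Assumption: "standalone" means bounded by whitespace or the string edges.
                boolean atTokenStart = pos == 0
                        || Character.isWhitespace(query.charAt(pos - 1));
                boolean atTokenEnd = pos + 1 == query.length()
                        || Character.isWhitespace(query.charAt(pos + 1));
                if (atTokenStart && atTokenEnd) {
                    out.append('\\'); // literal match instead of MUST / NOT
                }
            }
            out.append(ch);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The three query shapes discussed in the comment: "+", "foo +", "+ hello +".
        System.out.println(escapeStandaloneOperators("+"));
        System.out.println(escapeStandaloneOperators("foo +"));
        System.out.println(escapeStandaloneOperators("+ hello +"));
    }
}
```

An operator that is attached to a term, as in "+hello -world", is left untouched by this sketch, so the usual MUST / NOT meaning is preserved there.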