lucene-solr-user mailing list archives

From "Jack Krupansky" <j...@basetechnology.com>
Subject Re: Solr 4.0alpha: edismax complaints on certain characters
Date Thu, 06 Sep 2012 16:38:21 GMT
The fix in edismax was made on 6/28, a few days before the formal 
announcement of 4.0-ALPHA (7/3), but unfortunately a few days after the 
cutoff for 4.0-ALPHA (6/25), so it is not included in that release.

See:
https://issues.apache.org/jira/browse/SOLR-3467

(That issue should probably be annotated to indicate that it "affects" 
4.0-ALPHA.)

-- Jack Krupansky

-----Original Message----- 
From: Alexandre Rafalovitch
Sent: Thursday, September 06, 2012 10:13 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4.0alpha: edismax complaints on certain characters

I am on 4.0 alpha. Maybe it was fixed in beta. But I am most
definitely seeing this in edismax. If I get rid of / and use
debugQuery, I get:
'responseHeader'=>{
    'status'=>0,
    'QTime'=>14,
    'params'=>{
      'debugQuery'=>'true',
      'indent'=>'true',
      'q'=>'foobar',
      'qf'=>'TitleEN DescEN',
      'wt'=>'ruby',
      'defType'=>'edismax'}},
  'response'=>{'numFound'=>0,'start'=>0,'docs'=>[]
  },
  'debug'=>{
    'rawquerystring'=>'foobar',
    'querystring'=>'foobar',
    'parsedquery'=>'(+DisjunctionMaxQuery((DescEN:foobar | TitleEN:foobar)))/no_coord',
    'parsedquery_toString'=>'+(DescEN:foobar | TitleEN:foobar)',
    'explain'=>{},
    'QParser'=>'ExtendedDismaxQParser',
....

I'll check beta on my machine by tomorrow.

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Thu, Sep 6, 2012 at 10:06 AM, Jack Krupansky <jack@basetechnology.com> 
wrote:
> That's what I was thinking, but when I tried foo/bar in Solr 3.6 and
> 4.0-BETA it was working fine - it split the term and generated the proper
> query without any error.
>
> I think the problem is if you use the default Lucene query parser, not
> edismax. I removed &defType=edismax from my query request and the problem
> reproduces.
>
> My two test queries:
> http://localhost:8983/solr/select/?debugQuery=true&defType=edismax&qf=features&q=foo/bar
> http://localhost:8983/solr/select/?debugQuery=true&df=features&q=foo/bar
>
> The first works; the second fails as reported (in 4.0-BETA, but works in
> 3.6).
>
> -- Jack Krupansky
>
> -----Original Message----- From: Yonik Seeley
> Sent: Thursday, September 06, 2012 9:53 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 4.0alpha: edismax complaints on certain characters
>
>
> I believe this is caused by the regex support in
> https://issues.apache.org/jira/browse/LUCENE-2039
>
> It certainly seems wrong to interpret a slash in the middle of the
> word as the start of a regex, so I've reopened the issue.
>
> -Yonik
> http://lucidworks.com
>
>
> On Thu, Sep 6, 2012 at 9:34 AM, Alexandre Rafalovitch
> <arafalov@gmail.com> wrote:
>>
>> Hello,
>>
>> I was under the impression that edismax was supposed to be crash proof
>> and just ignore bad syntax. But I have either misconfigured it or hit a
>> weird bug. I basically searched for text containing '/' and got this:
>>
>> {
>>   'responseHeader'=>{
>>     'status'=>400,
>>     'QTime'=>9,
>>     'params'=>{
>>       'qf'=>'TitleEN DescEN',
>>       'indent'=>'true',
>>       'wt'=>'ruby',
>>       'q'=>'foo/bar',
>>       'defType'=>'edismax'}},
>>   'error'=>{
>>     'msg'=>'org.apache.lucene.queryparser.classic.ParseException:
>> Cannot parse \'foo/bar \': Lexical error at line 1, column 9.
>> Encountered: <EOF> after : "/bar "',
>>     'code'=>400}}
>>
>> Is that normal? If it is, is there a known list of characters I need
>> to escape, or do I just have to catch the exception and tell the user
>> not to do this again?
>>
>> Regards,
>>    Alex.
>>
>> Personal blog: http://blog.outerthoughts.com/
>> LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
>> - Time is the quality of nature that keeps events from happening all
>> at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
>> book)
>
> 
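
On the question in the original message about which characters need
escaping: one possible workaround, until a fixed release is in use, is to
backslash-escape query-syntax characters in user input before sending it to
Solr. The following is a minimal Python sketch, not an official API. The
character set is an assumption modelled on what the classic Lucene query
parser treats as syntax (with "/" added because of the regex support
mentioned in this thread), and the helper names escape_query_chars and
build_select_url are hypothetical. SolrJ users may prefer
ClientUtils.escapeQueryChars, which serves a similar purpose.

```python
from urllib.parse import urlencode

# Assumption: an illustrative set of characters the classic Lucene query
# parser treats as syntax. "/" is included because of the regex support
# discussed in this thread. Check it against your Solr version.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&;/')


def escape_query_chars(text):
    """Backslash-escape query-syntax characters in raw user input."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


def build_select_url(user_text, base="http://localhost:8983/solr/select/"):
    """Build a select URL shaped like the second test query in this thread.

    The base URL and the df=features parameter are taken from that test
    query; adjust them for your own core and schema.
    """
    params = {
        "debugQuery": "true",
        "df": "features",
        "q": escape_query_chars(user_text),
    }
    # urlencode percent-encodes the backslash and the slash for the URL.
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(escape_query_chars("foo/bar"))
    print(build_select_url("foo/bar"))
```

Escaping happens before URL encoding, so the parser receives foo\/bar and
should treat the slash as a literal character rather than the start of a
regex. Note that escaping everything also disables operators a user may
have typed on purpose, so this suits plain search boxes better than
expert query fields.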

