lucene-solr-user mailing list archives

From Claire Pollard <claire.poll...@imagen.io>
Subject RE: Edismax ignoring queries containing booleans
Date Thu, 09 Jan 2020 10:22:50 GMT
Hey Edward,

Thanks for the tips. 😊

I've cleaned up my solrconfig (removed the duplicate df, the tabs and the newlines) and tried
commenting out the bits you suggested, then adding them back in one by one. It seems mm was
the thing breaking the query for me.

Without it, the query returns 2 documents as expected.

"debug":{
    "rawquerystring":"recordID:(18 OR 19 OR 20)",
    "querystring":"recordID:(18 OR 19 OR 20)",
    "parsedquery":"+((recordID:[18 TO 18]) (recordID:[19 TO 19]) (recordID:[20 TO 20])) DisjunctionMaxQuery(((text:\"19
20\"~100)^0.2 | (annotations:\"19 20\"~100)^0.6 | (collectionTitle:\"19 20\"~100)^2.0 | collectionDescription:\"19
20\"~100 | (title:\"19 20\"~100)^2.1 | (Test_FR:\"19 20\"~100)^1.1 | (Test_DE:\"19 20\"~100)^1.1
| (Test_AR:\"19 20\"~100)^1.1))",
    "parsedquery_toString":"+(recordID:[18 TO 18] recordID:[19 TO 19] recordID:[20 TO 20])
((text:\"19 20\"~100)^0.2 | (annotations:\"19 20\"~100)^0.6 | (collectionTitle:\"19 20\"~100)^2.0
| collectionDescription:\"19 20\"~100 | (title:\"19 20\"~100)^2.1 | (Test_FR:\"19 20\"~100)^1.1
| (Test_DE:\"19 20\"~100)^1.1 | (Test_AR:\"19 20\"~100)^1.1)",
    "explain":{
      "2CBF8A49-CA2D-4e42-88F2-3790922EF415":"\n1.0 = sum of:\n  1.0 = sum of:\n    1.0 =
recordID:[19 TO 19]\n",
      "F73CFBC7-2CD2-4aab-B8C1-9D19D427EAFB":"\n1.0 = sum of:\n  1.0 = sum of:\n    1.0 =
recordID:[20 TO 20]\n"},

The only visible difference, I think, is the ~2 that came after the initial part of the parsed
query:

Old Query start: +((recordID:[18 TO 18]) (recordID:[19 TO 19]) (recordID:[20 TO 20]))~2
New Query start: +((recordID:[18 TO 18]) (recordID:[19 TO 19]) (recordID:[20 TO 20]))

There shouldn't be a problem using mm with edismax, right? Or does the problem lie with the
structure of my qf/pf combined with mm?
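
A rough sketch of why the ~2 would matter, assuming the conditional mm syntax works the way the Solr reference guide describes it ("N<value" applies only when there are more than N optional clauses). The names min_should_match and matching_ids are hypothetical and this is not Solr's own code; it only mimics the arithmetic for the spec in this thread:

```python
def min_should_match(clause_count: int, spec: str) -> int:
    """Number of optional clauses that must match for a given mm spec."""
    result = clause_count
    spec = spec.strip()
    if "<" in spec:
        # Conditional spec: each "N<value" part applies only if clause_count > N.
        for part in spec.split():
            bound, value = part.split("<", 1)
            if clause_count <= int(bound):
                return result
            result = min_should_match(clause_count, value)
        return result
    if spec.endswith("%"):
        calc = clause_count * int(spec[:-1]) / 100
    else:
        calc = int(spec)
    # Negative values mean "all but this many"; fractions truncate toward zero.
    result = clause_count + int(calc) if calc < 0 else int(calc)
    return max(0, min(result, clause_count))


MM_SPEC = "0<1 2<-1 5<-2 6<90%"   # the mm value from the handler defaults
clauses = [18, 19, 20]            # recordID:(18 OR 19 OR 20)
indexed_ids = [19, 20]            # 18 does not exist in the index


def matching_ids(required: int) -> list:
    # A document has a single recordID, so it can satisfy at most one clause.
    return [doc_id for doc_id in indexed_ids
            if sum(doc_id == c for c in clauses) >= required]


required = min_should_match(len(clauses), MM_SPEC)
print(required)                # 2, which lines up with the ~2 in the old parsed query
print(matching_ids(required))  # []
print(matching_ids(1))         # [19, 20]
```

If that reading is right, three OR'd clauses on a single-valued field can never reach a minimum of 2, which would explain numFound being 0 with mm set and 2 without it.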

Cheers,
Claire.

-----Original Message-----
From: Edward Ribeiro <edward.ribeiro@gmail.com> 
Sent: 09 January 2020 02:28
To: solr-user@lucene.apache.org
Subject: Re: Edismax ignoring queries containing booleans

Hi Claire,

Unfortunately I didn't see anything in the debug explain that could be the source
of the problem. Like Saurabh, I tested on a core and it worked for me.

I suggest that you simplify the solrconfig (commenting out qf, mm, the spellchecker config and
pf, for example) and reload the core. If the query then works, reinsert the settings one
by one, reloading the core each time and checking whether the query still works.

A few remarks based on a snippet of the solrconfig you posted on a previous
e-mail:

* Your solrconfig.xml defines df two times (the debug shows "df":["text", "text"]);

* There are a couple of character references like &#x09;, &#x0D; and &#x0A;. It would be
nice to remove them.
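
To illustrate that cleanup: &#x09;, &#x0D; and &#x0A; are the XML character references for tab, carriage return and line feed. A minimal Python sketch of what the mm value looks like once those are decoded and the whitespace is collapsed (normalize_param is a hypothetical helper, not something Solr provides):

```python
import html


def normalize_param(raw: str) -> str:
    """Decode character references, then collapse all whitespace runs."""
    decoded = html.unescape(raw)      # "&#x0D;" -> "\r", "&lt;" -> "<", etc.
    return " ".join(decoded.split())  # drops leading/trailing tabs and newlines


raw_mm = "&#x0D;&#x0A;       0&lt;1 2&lt;-1 5&lt;-2 6&lt;90%&#x0D;&#x0A;      "
print(normalize_param(raw_mm))  # 0<1 2<-1 5<-2 6<90%
```

Inside solrconfig.xml itself the "<" still has to be written as &lt;, so only the tab/CR/LF references and the padding would go.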

Please let us know if you find out why. :)

Best,
Edward


On Wed, Jan 8, 2020 at 13:00, Claire Pollard <claire.pollard@imagen.io>
wrote:

> It would be lovely to be able to use range to complete my searches,
> but sadly documents aren't necessarily sequential so I might want say
> 18, 24 or 30 in future.
>
> I've re-run the query with debug on. Is there anything here that looks 
> unusual? Thanks.
>
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":75,
>     "params":{
>       "mm":"\r\n       0<1 2<-1 5<-2 6<90%\r\n      ",
>       "spellcheck.collateExtendedResults":"true",
>       "df":["text",
>         "text"],
>       "q.alt":"*:*",
>       "ps":"100",
>       "spellcheck.dictionary":["default",
>         "wordbreak"],
>       "bf":"",
>       "echoParams":"all",
>       "fl":"*,score",
>       "spellcheck.maxCollations":"5",
>       "rows":"10",
>       "spellcheck.alternativeTermCount":"5",
>       "spellcheck.extendedResults":"true",
>       "q":"recordID:(18 OR 19 OR 20)",
>       "defType":"edismax",
>       "spellcheck.maxResultsForSuggest":"5",
>       "qf":"\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\ttext^0.4 recordID^10.0 annotations^0.5 collectionTitle^1.9 collectionDescription^0.9 title^2.0 Test_FR^1.0 Test_DE^1.0 Test_AR^1.0 genre^1.0 genre_fr^1.0 french2^1.0\r\n\n\t\t\t\t\n\t\t\t",
>       "spellcheck":"on",
>       "pf":"\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\ttext^0.2 recordID^10.0 annotations^0.6 collectionTitle^2.0 collectionDescription^1.0 title^2.1 Test_FR^1.1 Test_DE^1.1 Test_AR^1.1 genre^1.1 genre_fr^1.1 french2^1.1\r\n\n\t\t\t\t\n\t\t\t",
>       "spellcheck.count":"10",
>       "debugQuery":"on",
>       "_":"1578499092576",
>       "spellcheck.collate":"true"}},
>   "response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]
>   },
>   "spellcheck":{
>     "suggestions":[],
>     "correctlySpelled":false,
>     "collations":[]},
>   "debug":{
>     "rawquerystring":"recordID:(18 OR 19 OR 20)",
>     "querystring":"recordID:(18 OR 19 OR 20)",
>     "parsedquery":"+((recordID:[18 TO 18]) (recordID:[19 TO 19]) (recordID:[20 TO 20]))~2 DisjunctionMaxQuery(((text:\"19 20\"~100)^0.2 | (annotations:\"19 20\"~100)^0.6 | (collectionTitle:\"19 20\"~100)^2.0 | collectionDescription:\"19 20\"~100 | (title:\"19 20\"~100)^2.1 | (Test_FR:\"19 20\"~100)^1.1 | (Test_DE:\"19 20\"~100)^1.1 | (Test_AR:\"19 20\"~100)^1.1))",
>     "parsedquery_toString":"+((recordID:[18 TO 18] recordID:[19 TO 19] recordID:[20 TO 20])~2) ((text:\"19 20\"~100)^0.2 | (annotations:\"19 20\"~100)^0.6 | (collectionTitle:\"19 20\"~100)^2.0 | collectionDescription:\"19 20\"~100 | (title:\"19 20\"~100)^2.1 | (Test_FR:\"19 20\"~100)^1.1 | (Test_DE:\"19 20\"~100)^1.1 | (Test_AR:\"19 20\"~100)^1.1)",
>     "explain":{},
>     "QParser":"ExtendedDismaxQParser",
>     "altquerystring":null,
>     "boost_queries":null,
>     "parsed_boost_queries":[],
>     "boostfuncs":[""],
>     "timing":{
>       "time":75.0,
>       "prepare":{
>         "time":35.0,
>         "query":{
>           "time":35.0},
>         "facet":{
>           "time":0.0},
>         "facet_module":{
>           "time":0.0},
>         "mlt":{
>           "time":0.0},
>         "highlight":{
>           "time":0.0},
>         "stats":{
>           "time":0.0},
>         "expand":{
>           "time":0.0},
>         "terms":{
>           "time":0.0},
>         "spellcheck":{
>           "time":0.0},
>         "debug":{
>           "time":0.0}},
>       "process":{
>         "time":38.0,
>         "query":{
>           "time":29.0},
>         "facet":{
>           "time":0.0},
>         "facet_module":{
>           "time":0.0},
>         "mlt":{
>           "time":0.0},
>         "highlight":{
>           "time":0.0},
>         "stats":{
>           "time":0.0},
>         "expand":{
>           "time":0.0},
>         "terms":{
>           "time":0.0},
>         "spellcheck":{
>           "time":6.0},
>         "debug":{
>           "time":1.0}}}}}
>
> -----Original Message-----
> From: Edward Ribeiro <edward.ribeiro@gmail.com>
> Sent: 07 January 2020 01:05
> To: solr-user@lucene.apache.org
> Subject: Re: Edismax ignoring queries containing booleans
>
> Hi Claire,
>
> You can add the following parameter `&debug=all` on the URL to bring 
> back debugging info and share with us (if you are using the Solr admin 
> UI you should check the `debugQuery` checkbox).
>
> Also, if you are searching a sequence of values you could perform a
> range query: recordID:[18 TO 20]
>
> Best,
> Edward
>
> On Mon, Jan 6, 2020 at 10:46 AM Claire Pollard 
> <claire.pollard@imagen.io>
> wrote:
> >
> > Ok... It doesn't work for me. I'm fairly new to Solr so any help
> > would be appreciated!
> >
> > My managed-schema field and field type look like this:
> >
> > <field name="recordID" type="long" indexed="true" stored="true" required="true" multiValued="false" />
> > <fieldType name="long" class="solr.LongPointField" sortMissingLast="true" omitNorms="true" />
> >
> > And my solrconfig.xml select/query handlers look like this:
> >
> >         <requestHandler name="/select" class="solr.SearchHandler">
> >                 <lst name="defaults">
> >                         <str name="echoParams">all</str>
> >                         <!-- Query settings -->
> >                         <str name="defType">edismax</str>
> >                         <str name="qf">
> >                                 &#x09;text^0.4 recordID^10.0 annotations^0.5 collectionTitle^1.9 collectionDescription^0.9 title^2.0 Test_FR^1.0 Test_DE^1.0 Test_AR^1.0 genre^1.0 genre_fr^1.0 french2^1.0&#x0D;&#x0A;
> >                         </str>
> >                         <str name="df">text</str>
> >                         <str name="q.alt">*:*</str>
> >                         <str name="rows">10</str>
> >                         <str name="fl">*,score</str>
> >                         <str name="pf">
> >                                 &#x09;text^0.2 recordID^10.0 annotations^0.6 collectionTitle^2.0 collectionDescription^1.0 title^2.1 Test_FR^1.1 Test_DE^1.1 Test_AR^1.1 genre^1.1 genre_fr^1.1 french2^1.1&#x0D;&#x0A;</str>
> >                         <str name="bf" />
> >                         <str name="mm">&#x0D;&#x0A;       0&lt;1 2&lt;-1 5&lt;-2 6&lt;90%&#x0D;&#x0A;      </str>
> >                         <int name="ps">100</int>
> >                         <!--SpellChecking -->
> >                         <str name="df">text</str>
> >                         <!-- Solr will use suggestions from both the 'default' spellchecker
> >      and from the 'wordbreak' spellchecker and combine them.
> >      collations (re-written queries) can include a combination of
> >      corrections from both spellcheckers -->
> >                         <str name="spellcheck.dictionary">default</str>
> >                         <str name="spellcheck.dictionary">wordbreak</str>
> >                         <str name="spellcheck">on</str>
> >                         <str name="spellcheck.extendedResults">true</str>
> >                         <str name="spellcheck.count">10</str>
> >                         <str name="spellcheck.alternativeTermCount">5</str>
> >                         <str name="spellcheck.maxResultsForSuggest">5</str>
> >                         <str name="spellcheck.collate">true</str>
> >                         <str name="spellcheck.collateExtendedResults">true</str>
> >                         <str name="spellcheck.maxCollations">5</str>
> >                 </lst>
> >                 <arr name="last-components">
> >                         <str>spellcheck</str>
> >                 </arr>
> >                 <!-- In addition to defaults, "appends" params can be specified
> >          to identify values which should be appended to the list of
> >          multi-val params from the query (or the existing "defaults").
> >       -->
> >         </requestHandler>
> >
> >         <requestHandler name="/query" class="solr.SearchHandler">
> >                 <lst name="defaults">
> >                         <str name="echoParams">explicit</str>
> >                         <str name="wt">json</str>
> >                         <str name="indent">true</str>
> >                         <str name="df">text</str>
> >                 </lst>
> >         </requestHandler>
> >
> > Is there anything else that might be useful in helping diagnose
> > what's going wrong for me?
> >
> > Cheers,
> > Claire.
> >
> > -----Original Message-----
> > From: Saurabh Sharma <saurabh.infoedge@gmail.com>
> > Sent: 06 January 2020 11:20
> > To: solr-user@lucene.apache.org
> > Subject: Re: Edismax ignoring queries containing booleans
> >
> > It should work well. I have just tested the same with 8.3.0.
> >
> > Thanks
> > Saurabh Sharma
> >
> > On Mon, Jan 6, 2020, 4:31 PM Claire Pollard 
> > <claire.pollard@imagen.io>
> > wrote:
> >
> > > I'm using:
> > >
> > > recordID:(18 OR 19 OR 20)
> > >
> > > Which should return 2 records (as 18 doesn't exist), but it
> > > returns none.
> > > recordID is a LongPointField (sorry I said Int in my previous message).
> > >
> > > -----Original Message-----
> > > From: Saurabh Sharma <saurabh.infoedge@gmail.com>
> > > Sent: 06 January 2020 10:35
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: Edismax ignoring queries containing booleans
> > >
> > > Please share the query which you are creating.
> > >
> > > On Mon, Jan 6, 2020, 3:52 PM Claire Pollard 
> > > <claire.pollard@imagen.io>
> > > wrote:
> > >
> > > > In Solr 8.3.0 I've got an edismax query parser in my search 
> > > > handler, and it seems to be ignoring Boolean operators such as 
> > > > AND and OR when searching using an IntPointField.
> > > >
> > > > I was hoping to use a query to this field to return a batch of 
> > > > documents with non-sequential IDs, so a range would be inappropriate.
> > > >
> > > > We had a previous 4.10.2 instance of Solr which uses the now
> > > > deprecated Trie fields, and these seem to search without issue
> > > > using boolean operators.
> > > >
> > > > Is there something extra I need to do with my setup for
> > > > PointFields to use booleans, or should they work by default?
> > > >
> > > > Cheers,
> > > > Claire.
> > > >
> > >
>
>