lucene-solr-user mailing list archives

From Wouter Admiraal <...@wadmiraal.net>
Subject When using Dismax, Solr 5.1 tries to compare the entire field to the search string, instead of only using keywords
Date Thu, 04 Jun 2015 07:22:52 GMT
Hi all.

Sorry about the title, but I don't know how to be more explicit than
that. I am upgrading a Solr 1.4 install to Solr 5.1. I went through all
the changes, updated my schema.xml, etc. Everything works (I
re-indexed instead of migrating the existing index). I can search for
documents, no problem there.

Where I do have a problem is with dismax. It doesn't behave like
before. It must be a configuration issue, or maybe I never really
understood how it is supposed to work.

I have 2 documents, which can be summarized as follows:

{
  "label": "Food Inc",
  "keywords": ["Food", "Nutrition"]
}

{
  "label": "Food check online",
  "keywords": ["Internet", "Health"]
}

If I disable dismax and search for "Food" (?q=Food), I find both
documents. So far, so good.

If I turn dismax on and add a boost to the label, I get 0 results
(?q=Food&defType=dismax&qf=label^3.0).

If I turn dismax on and add a boost to the keywords, I get 1 result
("Food Inc", which has a keyword "Food";
?q=Food&defType=dismax&qf=keywords^2.0).
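
For anyone who wants to reproduce this, the three requests above can be
built as in the rough sketch below. The host, port and core name
("mycore") are placeholders and not taken from my setup; only the query
parameters are the ones quoted above.

```python
from urllib.parse import urlencode

# Hypothetical base URL: host, port and core name are placeholders.
BASE = "http://localhost:8983/solr/mycore/select"


def build_url(params):
    """Return the select URL for the given query parameters."""
    # urlencode percent-encodes the "^" in the qf boost as %5E.
    return BASE + "?" + urlencode(params)


# The three cases described above, keyed by a short label.
queries = {
    "no_dismax": {"q": "Food"},
    "dismax_label": {"q": "Food", "defType": "dismax", "qf": "label^3.0"},
    "dismax_keywords": {"q": "Food", "defType": "dismax", "qf": "keywords^2.0"},
}

for name, params in queries.items():
    print(name, build_url(params))
```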

So, from what I understand, it tries to match the search term
*exactly* when dismax is enabled (same for edismax), but uses a
"contains keyword" logic when dismax is disabled. This means
"Food" !== "Food Inc" with dismax on.
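
To make that guess concrete, here is a small sketch of the two
behaviors I think I am seeing. It is only a model of my observations,
not how Solr actually matches; the helper names (exact_match,
token_match, search) are made up for illustration.

```python
# The two documents summarized above.
docs = [
    {"label": "Food Inc", "keywords": ["Food", "Nutrition"]},
    {"label": "Food check online", "keywords": ["Internet", "Health"]},
]


def values(doc, field):
    """Return the field's values as a list (single values are wrapped)."""
    value = doc[field]
    return value if isinstance(value, list) else [value]


def exact_match(doc, field, term):
    """True if any value of the field equals the term as a whole."""
    return any(value == term for value in values(doc, field))


def token_match(doc, field, term):
    """True if any whitespace-separated word of the field equals the term."""
    return any(term in value.split() for value in values(doc, field))


def search(matcher, field, term):
    """Return the labels of the documents the matcher accepts."""
    return [doc["label"] for doc in docs if matcher(doc, field, term)]


print(search(exact_match, "label", "Food"))     # whole-value match on label
print(search(exact_match, "keywords", "Food"))  # whole-value match on keywords
print(search(token_match, "label", "Food"))     # "contains keyword" match
```

In this model the whole-value comparison reproduces the dismax results
(0 hits on label, 1 hit on keywords), while the word-level comparison
finds both documents.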

When I turn on debug, I get the following:

"debug": {
  "rawquerystring": "Food",
  "querystring": "Food",
  "parsedquery": "(+DisjunctionMaxQuery((label:Food^3.0)) ())/no_coord",
  "parsedquery_toString": "+(label:Food^3.0) ()",
  "explain": {},
  "QParser": "DisMaxQParser",
  "altquerystring": null,
  "boostfuncs": null,
  ...
}

I don't understand how/why this doesn't use a "contains" operator.
This was the behavior on the old 1.4 instance. I went through the
changelog from 1.4 to 5.1, but I couldn't find any explicit information
about dismax behaving differently, except that the "mm" parameter needs
a default. I tried many values for mm (including 0, 100%, 100, etc.) but
to no avail.

Thanks for your help.

Best regards,

Wouter Admiraal
