lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "DisMaxRequestHandler" by OtisGospodnetic
Date Wed, 05 Mar 2008 21:10:45 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by OtisGospodnetic:
http://wiki.apache.org/solr/DisMaxRequestHandler

The comment on the change is:
Typos, syntax, etc. cleanup

------------------------------------------------------------------------------
- The DisMaxRequestHandler is designed to process simple user entered phrases (without heavy
syntax) and search for the individual words across several fields using different weighting
based on the significance of each field.  Additional options let you influence the score based
on rules specific to each use case (independent of user input)
+ The DisMaxRequestHandler is designed to process simple user-entered phrases (without heavy
syntax) and search for the individual words across several fields using different weighting
(boosts) based on the significance of each field.  Additional options let you influence the
score based on rules specific to each use case (independent of user input).
  
  
  [[TableOfContents]]
@@ -8, +8 @@

  
  == Overview ==
  
- This query handler supports an extremely simplified subset of the Lucene !QueryParser syntax.
 Quotes can be used to group phrases, and +/- can be used to denote mandatory and optional
clauses ... but all other Lucene query parser special characters are escaped to simplify the
user experience.  The handler takes responsibility for building a good query from the user's
input using !BooleanQueries containing !DisjunctionMaxQueries across fields and boosts you
specify It also allows you to provide additional boosting queries, boosting functions, and
filtering queries to artificially affect the outcome of all searches.  These options can all
be specified as default  parameters for the handler in your solrconfig.xml or overridden the
Solr query URL.
+ This query handler supports an extremely simplified subset of the Lucene !QueryParser syntax.
 Quotes can be used to group phrases, and +/- can be used to denote mandatory and optional
clauses ... but all other Lucene query parser special characters are escaped to simplify the
user experience.  The handler takes responsibility for building a good query from the user's
input using !BooleanQueries containing !DisjunctionMaxQueries across fields and boosts you
specify.  It also lets you provide additional boosting queries, boosting functions, and filtering
queries to artificially affect the outcome of all searches.  These options can all be specified
as default parameters for the handler in your solrconfig.xml or overridden in the Solr query
URL.
  
  == Parameters ==
  
@@ -21, +21 @@

  
  === q ===
  
- The guts of the search defining the main "query".  This is designed to be support raw input
strings provided by users with no special escaping.   '+' and '-' characters are treated as
"mandatory" and "prohibited" modifiers for the subsequent terms.  Text wrapped in balance
quote characters '"' are treated as phrases, any query containing an odd number of quote characters
is evaluated as if there were no quote characters at all.  Wildcards in this "q" parameter
are not supported.
+ The guts of the search defining the main "query".  This is designed to support raw input
strings provided by users with no special escaping.   '+' and '-' characters are treated as
"mandatory" and "prohibited" modifiers for the subsequent terms.  Text wrapped in balanced
quote characters '"' is treated as a phrase; any query containing an odd number of quote characters
is evaluated as if there were no quote characters at all.  Wildcards in this "q" parameter
are not supported.
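
The quote rule above can be sketched in a few lines of Python. This is only an illustration of the behaviour as described here, not the handler's actual code; the function name `strip_unbalanced_quotes` is made up for the example, and "as if there were no quote characters" is interpreted as simply dropping them.

```python
def strip_unbalanced_quotes(q):
    """Return q unchanged if its quotes are balanced, else drop every quote.

    Assumption: an odd number of '"' characters means the quotes are
    unbalanced, and the query is then treated as if it had no quotes.
    """
    if q.count('"') % 2 == 1:
        return q.replace('"', "")
    return q


print(strip_unbalanced_quotes('"solr search" fast'))  # balanced: kept as a phrase
print(strip_unbalanced_quotes('"solr search fast'))   # unbalanced: quotes dropped
```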
  
  === qf (Query Fields) ===
  
- List of fields and the "boosts" to associate with each of them when building !DisjunctionMaxQueries
from the users query.  The format supported is {{{fieldOne^2.3 fieldTwo fieldThree^0.4}}}
which indicates that fieldOne has a boost of 2.3, fieldTwo has the default boost, and fieldThree
has a boost of 0.4 ... this indicates that matches in fieldOne are much more significant then
matches in fieldTwo, which are more significant then matches in fieldThree.
+ List of fields and the "boosts" to associate with each of them when building !DisjunctionMaxQueries
from the user's query.  The format supported is {{{fieldOne^2.3 fieldTwo fieldThree^0.4}}},
which indicates that fieldOne has a boost of 2.3, fieldTwo has the default boost, and fieldThree
has a boost of 0.4 ... this indicates that matches in fieldOne are much more significant than
matches in fieldTwo, which are more significant than matches in fieldThree.
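
As a rough illustration of the "qf" format, the following Python sketch splits such a string into field names and boosts. The helper name `parse_field_boosts` is hypothetical, and the default boost of 1.0 is an assumption made for the example.

```python
def parse_field_boosts(spec, default_boost=1.0):
    """Parse a qf-style string such as 'fieldOne^2.3 fieldTwo' into a dict.

    Assumption: fields are whitespace-separated and an optional boost
    follows a '^'; fields without one get default_boost.
    """
    boosts = {}
    for token in spec.split():
        # rpartition returns ('', '', token) when there is no '^'
        field, sep, boost = token.rpartition("^")
        if sep:
            boosts[field] = float(boost)
        else:
            boosts[token] = default_boost
    return boosts


print(parse_field_boosts("fieldOne^2.3 fieldTwo fieldThree^0.4"))
```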
  
  
  === mm (Minimum 'Should' Match) ===
  
- When dealing with queries there are 3 types of "clauses" that Lucene knows about: mandatory,
prohibited, and 'optional' (aka: "SHOULD")  By default all words or phrases specified in the
"q" param are treated as "optional" clauses unless they are proceeded by a "+" or a "-". But
when dealing with these "optional" clauses, the "mm" option makes it possible to say that
a certain minimum number of those clauses must match.  Specifying this minimum number can
be done in complex ways, equating to ideas like...
+ When dealing with queries there are 3 types of "clauses" that Lucene knows about: mandatory,
prohibited, and 'optional' (aka: "SHOULD").  By default all words or phrases specified in the
"q" param are treated as "optional" clauses unless they are preceded by a "+" or a "-". 
 When dealing with these "optional" clauses, the "mm" option makes it possible to say that
a certain minimum number of those clauses must match (mm).  Specifying this minimum number
can be done in complex ways, equating to ideas like...
  
      * At least 2 of the optional clauses must match, regardless of how many clauses there
are: "{{{2}}}"
      * At least 75% of the optional clauses must match, rounded down: "{{{75%}}}"
-     * If there are less then 3 optional clauses, they all must match; if their are 3 or
more, then 75% must match, rounded up: "{{{2<-25%}}}"
+     * If there are fewer than 3 optional clauses, they all must match; if there are 3 or
more, then 75% must match, rounded up: "{{{2<-25%}}}"
-     * If there are less then 3 optional clauses, they all must match; for 3 to 5 clauses,
one less then the number of clauses must match, for 6 or more clauses, 80% must match, rounded
down:  "{{{2<-1 5<80%}}}"
+     * If there are fewer than 3 optional clauses, they all must match; for 3 to 5 clauses,
one less than the number of clauses must match; for 6 or more clauses, 80% must match, rounded
down:  "{{{2<-1 5<80%}}}"
  
  The full variety of complex expressions supported is explained in detail [http://lucene.apache.org/solr/api/org/apache/solr/util/doc-files/min-should-match.html
here].
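
To make the four "mm" examples above concrete, here is a small Python sketch that evaluates a spec against a number of optional clauses. It follows the rules as described on this page and is not copied from Solr; the function name `min_should_match` is hypothetical, and the reading of negative values as "this many (or this percentage) may be missing" is an assumption drawn from the examples.

```python
def min_should_match(clause_count, spec):
    """Return how many of clause_count optional clauses must match.

    Assumed grammar, based on the examples in the text:
      "2"          a fixed number
      "75%"        a percentage, rounded down
      "-25%"       a percentage that may be missing, rounded down
      "2<-25%"     applies only when there are more than 2 clauses
      "2<-1 5<80%" several conditions, the last applicable one wins
    """
    if "<" in spec:
        result = clause_count  # below the first threshold, all must match
        for part in spec.split():
            upper, _, sub_spec = part.partition("<")
            if clause_count <= int(upper):
                return result
            result = min_should_match(clause_count, sub_spec)
        return result

    value = int(spec.rstrip("%"))
    amount = abs(value)
    if spec.endswith("%"):
        amount = clause_count * amount // 100  # integer division rounds down
    result = clause_count - amount if value < 0 else amount
    # never require more clauses than exist, or fewer than zero
    return max(0, min(clause_count, result))


for n in (1, 2, 3, 4, 5, 6, 10):
    print(n, min_should_match(n, "2<-1 5<80%"))
```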
  
@@ -43, +43 @@

  
  === pf (Phrase Fields) ===
  
- Once the list of matching documents has been identified using the "fq" and "qf" params,
the "pf" param can be used to "boost" to the score of a documents in cases where all of the
terms in the "q" param appear in close proximity.
+ Once the list of matching documents has been identified using the "fq" and "qf" params,
the "pf" param can be used to "boost" the score of documents in cases where all of the terms
in the "q" param appear in close proximity.
  
- The format is the same as the "qf" param: a list of fields and the "boosts" to associate
with each of them when making phrase queries out of the entire "q" param.
+ The format is the same as the "qf" param: a list of fields and "boosts" to associate with
each of them when making phrase queries out of the entire "q" param.
  
  === ps (Phrase Slop) ===
  
@@ -59, +59 @@

  
  Float value to use as tiebreaker in !DisjunctionMaxQueries (should be something much less
than 1)
  
- When a term from the users input is tested against multiple fields, More then one may match
and each field will generate a different score based on how common that word is in that field
(for each document relative all other documents).  The Score will The "tie" param let's you
configure how much the final score of the query will be influenced by the scores of the lower
scoring fields compared to the highest scoring field.
+ When a term from the user's input is tested against multiple fields, more than one field
may match and each field will generate a different score based on how common that word is
in that field (for each document relative to all other documents).  The "tie" param lets
you configure how much the final score of the query will be influenced by the scores of the
lower scoring fields compared to the highest scoring field.
  
- A value of "0.0" makes the query a pure "disjunction max query" -- only the maximum scoring
sub query contributes to the final score.  A value of "1.0" makes the query a pure "disjunction
sum query" where it doesn't matter what the maximum scoring sub query is, the final score
is hte sum of the sub scores.  Typically a low value (ie: 0.1) is useful.
+ A value of "0.0" makes the query a pure "disjunction max query" -- only the maximum scoring
sub query contributes to the final score.  A value of "1.0" makes the query a pure "disjunction
sum query" where it doesn't matter what the maximum scoring sub query is, the final score
is the sum of the sub scores.  Typically a low value (e.g., 0.1) is useful.
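
The effect of "tie" can be illustrated with a short Python sketch: the best sub score counts in full and the remaining sub scores are scaled by the tiebreaker. The function name and the sample scores are invented for the example.

```python
def dismax_score(field_scores, tie=0.0):
    """Combine per-field scores for one term, dismax style.

    tie=0.0 gives the pure maximum; tie=1.0 gives the plain sum.
    """
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)


scores = [0.8, 0.3, 0.1]  # hypothetical scores of one term in three fields
for tie in (0.0, 0.1, 1.0):
    print(tie, dismax_score(scores, tie))
```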
  
  
  === bq (Boost Query) ===
  
- A raw query string (in the SolrQuerySyntax) that will be included with the users query to
influence the score.  If this is a !BooleanQuery with a default boost (1.0f) then the individual
clauses will be added directly to the main query. Otherwise the query will be included as
is.
+ A raw query string (in the SolrQuerySyntax) that will be included with the user's query
to influence the score.  If this is a !BooleanQuery with a default boost (1.0f) then the individual
clauses will be added directly to the main query. Otherwise, the query will be included as
is.
  
  /!\ :TODO: /!\ is the part about !BooleanQueries with boost of 1 still true?
  
  === bf (Boost Functions) ===
  
- [:FunctionQuery: Functions] (with optional boosts) that will be included in the users query
to influence the score.  Any function supported natively by Solr can be used, along with a
boost value, ie: recip(rord(myfield),1,2,3)^1.5
+ [:FunctionQuery: Functions] (with optional boosts) that will be included in the user's query
to influence the score.  Any function supported natively by Solr can be used, along with a
boost value, e.g.: recip(rord(myfield),1,2,3)^1.5
  
  Specifying functions with the "bf" param is just shorthand for using the {{{_val_:"...function..."}}}
syntax in a "bq" param.
  
@@ -97, +97 @@

  which can be overridden in the URL...
    http://localhost:8983/solr/select/?q=video&qt=dismax&fl=*,score
  
- You can also override which fields are searched on, and how much boost
+ You can also override which fields are searched on and how much boost
  each field gets...
    http://localhost:8983/solr/select/?q=video&qt=dismax&qf=features^20.0+text^0.3
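
When such URLs are built from a program, the parameter values should be URL-encoded. Below is a minimal Python sketch using only the standard library; the helper name `dismax_url` is hypothetical, and the base URL is the example one used on this page. Note that the encoder writes "^" as "%5E" and a space as "+".

```python
from urllib.parse import urlencode

# Base URL taken from the examples on this page
SOLR_SELECT = "http://localhost:8983/solr/select/"


def dismax_url(q, qf=None, fl=None, base=SOLR_SELECT):
    """Build a select URL that routes the query to the dismax handler."""
    params = {"q": q, "qt": "dismax"}
    if qf is not None:
        params["qf"] = qf  # overrides the fields and boosts from solrconfig.xml
    if fl is not None:
        params["fl"] = fl
    return base + "?" + urlencode(params)


print(dismax_url("video", qf="features^20.0 text^0.3"))
```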
  
@@ -107, +107 @@

    http://localhost:8983/solr/select/?q=video&qt=dismax&fl=name,score,inStock
    http://localhost:8983/solr/select/?q=video&qt=instock&fl=name,score,inStock
  
- One of the other really cool features in this handler, is robust
+ One of the other really cool features in this handler is robust
  support for specifying the "BooleanQuery.minimumNumberShouldMatch" you
- want to be used based on how many terms are in your users query.
+ want to be used based on how many terms are in your user's query.
  This allows flexibility for typos and partial matches.  For the
  dismax handler, 1 and 2 word queries require that all of the optional
  clauses match, but for 3-5 word queries one missing word is allowed...
