lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "DisMax" by JanHoydahl
Date Wed, 16 Feb 2011 23:06:57 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "DisMax" page has been changed by JanHoydahl.
The comment on this change is: Rewrite more clearly the concept.
http://wiki.apache.org/solr/DisMax?action=diff&rev1=1&rev2=2

--------------------------------------------------

- The term “dismax” gets tossed around on the Solr lists frequently, which can be fairly
confusing to new users. It originated as a shorthand name for the DisMaxRequestHandler (which
was named after the !DisjunctionMaxQueryParser, which was named after the !DisjunctionMaxQuery
 class that it uses heavily). 
+ '''DisMax''' is an abbreviation of Disjunction Max, and is a popular query mode in Solr.
  
- In recent years, the DisMaxRequestHandler and the StandardRequestHandler were both refactored
into a single SearchHandler class, and now the term “dismax” usually refers to the [[DisMaxQParserPlugin]].

+ Simply put, it's the parser to choose for user-generated queries.
+ 
+ Out of the box, Solr uses the standard Solr query parser, which is fairly limited: it understands
only syntactically correct boolean queries like "title:foo OR body:foo", it searches only
one field by default, and it may well throw an exception if the input contains
characters it does not like.
+ 
+ Therefore a new, more robust query mode was needed, and the [[DisMaxQParserPlugin|DisMax
Query Parser]] was born. It is designed to process simple user-entered phrases (without heavy
syntax) and to search for the individual words across several fields, weighting (boosting)
each field according to its significance. It should never throw an exception.
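
As a rough sketch of what such a request can look like, the Python snippet below builds a select URL that sends raw user input through the dismax parser with per-field boosts. The host, the handler path, the field names and the boost values are all assumptions made up for illustration; only the `defType`, `q` and `qf` parameter names come from Solr itself.

```python
from urllib.parse import urlencode

# Hypothetical Solr location; adjust to your own installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_dismax_url(user_input, field_boosts):
    """Build a select URL that runs user_input through the dismax parser."""
    # qf takes space-separated "field^boost" entries.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    params = {"defType": "dismax", "q": user_input, "qf": qf}
    return SOLR_SELECT_URL + "?" + urlencode(params)


# Illustrative fields and boosts only: title counts most, body least.
url = build_dismax_url("foo bar", {"title": 2.0, "body": 1.0, "keywords": 1.5})
print(url)
```

The user input is passed as-is in `q`; no field prefixes or boolean operators are required from the user, which is the point of this query mode.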
+ 
+ '''Disjunction''' refers to the fact that your search is executed across multiple fields,
e.g. title, body and keywords, with different relevance weights.
+ 
+ '''Max''' means that if your word "foo" matches both title and body, the higher of the
two scores (probably the title match) is added to the document's score, not the sum of the two
as a simple OR query would do. This gives more control over your ranking.
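
The difference between "max" and "sum" can be sketched with a few lines of Python. The per-field scores below are invented numbers, not real Solr output, and the function names are hypothetical. The optional `tie` argument mirrors the tie-breaker of a disjunction-max query, where non-winning fields contribute a fraction of their score; with the assumed default of 0.0 only the best field counts.

```python
def sum_score(field_scores):
    """Simple OR behaviour: every matching field adds its full score."""
    return sum(field_scores)


def dismax_score(field_scores, tie=0.0):
    """Disjunction-max behaviour: best field wins, others scaled by tie."""
    best = max(field_scores)
    others = sum(field_scores) - best
    return best + tie * others


# Made-up scores for the word "foo": title match 2.0, body match 0.5.
scores = [2.0, 0.5]

print(sum_score(scores))               # both fields counted in full
print(dismax_score(scores))            # only the title match counts
print(dismax_score(scores, tie=0.1))   # body match adds a small bonus
```

With pure max, a document matching "foo" in many weak fields cannot outrank one with a single strong title match merely by accumulating scores.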
+ 
+ !DisMax is usually the short name for the actual query parser, the [[DisMaxQParserPlugin]].
  
  There is a Lucid Imagination Blog post that explains the [[http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/|origins
and conceptual behavior]] of dismax functionality.
  
