lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "MoreLikeThis" by AaronDaubman
Date Thu, 28 Jun 2012 15:04:58 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "MoreLikeThis" page has been changed by AaronDaubman:
http://wiki.apache.org/solr/MoreLikeThis?action=diff&rev1=16&rev2=17

Comment:
Added defaults  - very useful for those trying to figure out why they are getting so few terms

  !MoreLikeThis constructs a lucene query based on terms within a document.  For best results, use stored !TermVectors in the schema.xml for fields you will use for similarity. {{{
   <field name="cat" ... termVectors="true" />
  }}}
- If termVectors are not stored, !MoreLikeThis will generate terms from stored fields.  
+ If termVectors are not stored, !MoreLikeThis will generate terms from stored fields.
  
  
  == Common Parameters ==
  
- || '''param''' || '''description''' ||
+ || '''param''' || '''description''' || '''defaults (from 3.6.0 MoreLikeThis.java)''' ||
- || mlt.fl      || The fields to use for similarity.  NOTE: if possible, these should have a stored TermVector ||
+ || mlt.fl      || The fields to use for similarity.  NOTE: if possible, these should have a stored TermVector || DEFAULT_FIELD_NAMES = new String[] {"contents"} ||
- || mlt.mintf   || Minimum Term Frequency - the frequency below which terms will be ignored in the source doc. ||
+ || mlt.mintf   || Minimum Term Frequency - the frequency below which terms will be ignored in the source doc. || DEFAULT_MIN_TERM_FREQ = 2 ||
- || mlt.mindf   || Minimum Document Frequency - the frequency at which words will be ignored which do not occur in at least this many docs. ||
+ || mlt.mindf   || Minimum Document Frequency - the frequency at which words will be ignored which do not occur in at least this many docs. || DEFAULT_MIN_DOC_FREQ = 5 ||
- || mlt.minwl   || minimum word length below which words will be ignored. ||
+ || mlt.minwl   || minimum word length below which words will be ignored. || DEFAULT_MIN_WORD_LENGTH = 0 ||
- || mlt.maxwl   || maximum word length above which words will be ignored. ||
+ || mlt.maxwl   || maximum word length above which words will be ignored. || DEFAULT_MAX_WORD_LENGTH = 0 ||
- || mlt.maxqt   || maximum number of query terms that will be included in any generated query. ||
+ || mlt.maxqt   || maximum number of query terms that will be included in any generated query. || DEFAULT_MAX_QUERY_TERMS = 25 ||
- || mlt.maxntp  || maximum number of tokens to parse in each example doc field that is not stored with TermVector support.  ||
+ || mlt.maxntp  || maximum number of tokens to parse in each example doc field that is not stored with TermVector support.  || DEFAULT_MAX_NUM_TOKENS_PARSED = 5000 ||
- || mlt.boost   || [true/false] set if the query will be boosted by the interesting term relevance. ||
+ || mlt.boost   || [true/false] set if the query will be boosted by the interesting term relevance. || DEFAULT_BOOST = false ||
- || mlt.qf      || Query fields and their boosts using the same format as that used in [[DisMaxQParserPlugin]]. These fields must also be specified in mlt.fl. ||
+ || mlt.qf      || Query fields and their boosts using the same format as that used in [[DisMaxQParserPlugin]]. These fields must also be specified in mlt.fl. || ||
  
  == MoreLikeThisComponent ==
  
@@ -43, +43 @@

  
  When you specifically want information about similar documents, you can use the MoreLikeThisHandler.
  
- If you want to filter the similar results given by MoreLikeThis you have to use the MoreLikeThisHandler. It will consider the similar document result set as the main one so will apply the specified filters (fq) on it. If you use the !MoreLikeThisComponent and apply query filters it will be applied to the result set returned by the main query (!QueryComponent) and not to the one returned by the !MoreLikeThisComponent. 
+ If you want to filter the similar results given by MoreLikeThis you have to use the MoreLikeThisHandler. It will consider the similar document result set as the main one so will apply the specified filters (fq) on it. If you use the !MoreLikeThisComponent and apply query filters it will be applied to the result set returned by the main query (!QueryComponent) and not to the one returned by the !MoreLikeThisComponent.
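
For illustration only (this is not part of the wiki change), the parameters in the table and the fq behaviour of the MoreLikeThisHandler can be sketched as a small request-URL builder. The host, the `/mlt` handler path, the `id:1234` query and the `inStock:true` filter are assumptions made up for the example; the handler has to be registered in solrconfig.xml under whatever path you actually use. The keyword defaults simply mirror the 3.6.0 values quoted in the table.

```python
from urllib.parse import urlencode

# Hypothetical base URL: assumes a MoreLikeThisHandler is registered at /mlt.
SOLR_MLT_URL = "http://localhost:8983/solr/mlt"


def build_mlt_url(doc_query, fields, filters=(), mintf=2, mindf=5,
                  maxqt=25, boost=False):
    """Build a MoreLikeThisHandler request URL.

    doc_query selects the source document, fields become mlt.fl, and each
    entry in filters is sent as an fq parameter. With the handler, fq is
    applied to the similar-document result set.
    """
    params = [
        ("q", doc_query),
        ("mlt.fl", ",".join(fields)),  # comma-separated field list
        ("mlt.mintf", mintf),
        ("mlt.mindf", mindf),
        ("mlt.maxqt", maxqt),
        ("mlt.boost", "true" if boost else "false"),
    ]
    params.extend(("fq", fq) for fq in filters)
    return SOLR_MLT_URL + "?" + urlencode(params)


# Lower mintf/mindf when a small index yields too few interesting terms.
url = build_mlt_url("id:1234", ["cat"], filters=["inStock:true"],
                    mintf=1, mindf=1)
print(url)
```

Lowering `mlt.mintf` and `mlt.mindf` below their defaults is usually the first thing to try when the generated query contains very few terms, since the defaults of 2 and 5 discard rare terms on small test indexes.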
  
