lucene-general mailing list archives

From Gagandeep singh <gagan.g...@gmail.com>
Subject New implementation of MLT
Date Sun, 31 Mar 2013 05:18:30 GMT
Hi folks

We recently started using the default implementation of MLT
(org.apache.solr.handler.MoreLikeThisHandler) and found that it lacks a
few things:

   1. Searching for terms in the same field as the original document:
      - The current implementation picks the top field to search each
      interesting term in based on docFreq. This can give bad results:
      if, say, the original product has brand:"RED Valentino", we can end
      up searching for "red" in the color field.
   2. Phrase boosts:
      - If the product name is "business cards", it makes sense to give a
      phrase boost to products which are also business cards.
   3. Support for bq, bf, fq, multiplicative boost:
      - You might want to filter out out_of_stock products, or give a
      multiplicative boost to a product based on its price similarity or
      launch date.
   4. Support for explainOther
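
To make points 1 to 3 concrete, here is a rough sketch of the kind of
query such a parser could build. It is only an illustration, not the
actual MLTQueryParser code: the function names, the in_stock filter field
and the whitespace tokenization (standing in for real field analysis and
interesting-term selection) are all assumptions.

```python
def escape_term(text):
    """Backslash-escape characters that are special in Lucene query syntax."""
    special = set('+-&|!(){}[]^"~*?:\\/')
    return "".join("\\" + ch if ch in special else ch for ch in text)


def build_mlt_params(doc, term_fields, phrase_fields, phrase_boost=2.0, filters=()):
    """Build hypothetical request parameters for a more-like-this style query.

    Each term is searched only in the field it came from (point 1),
    multi-word values get a boosted phrase clause (point 2), and filters
    are passed through as fq entries (point 3).
    """
    clauses = []
    for field in term_fields:
        # Whitespace splitting is an assumption; a real parser would use
        # the field's analyzer and keep only the interesting terms.
        for term in str(doc.get(field, "")).split():
            clauses.append(f"{field}:{escape_term(term)}")
    for field in phrase_fields:
        value = str(doc.get(field, "")).strip()
        if " " in value:
            quoted = value.replace("\\", "\\\\").replace('"', '\\"')
            clauses.append(f'{field}:"{quoted}"^{phrase_boost}')
    return {"q": " OR ".join(clauses), "fq": list(filters)}


# Example document from the text: "red" stays in the brand field and is
# never searched in the color field.
source_doc = {"brand": "RED Valentino", "name": "business cards"}
params = build_mlt_params(
    source_doc,
    term_fields=["brand", "name"],
    phrase_fields=["name"],
    filters=["in_stock:true"],  # hypothetical field name
)
print(params["q"])
```

Keeping the field name attached to every term is what avoids the
brand/color mix-up described in point 1.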

We had a use case for each of these, and I ended up writing my own
MLTQueryParser, which builds the MLT query for a given document. It also
has a new concept called childDocs. You can think of some documents as
products, and a collection of products can be thought of as a category
page. You could search for similar documents based on the products a
category page has.
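
One way to picture childDocs is sketched below. It assumes that the
terms for a category page are pooled from its child products and ranked
by how often they occur. This is a guess for illustration only; the
function name, the lowercasing and the ranking rule are assumptions and
may differ from what the parser actually does.

```python
from collections import Counter


def top_terms_from_children(child_docs, fields, max_terms=5):
    """Pool (field, term) pairs across child documents and rank them.

    Returns up to max_terms tuples of (field, term, occurrences), most
    frequent first; ties are broken alphabetically by field, then term.
    """
    counts = Counter()
    for doc in child_docs:
        for field in fields:
            # Lowercased whitespace tokens stand in for real analysis.
            for term in str(doc.get(field, "")).lower().split():
                counts[(field, term)] += 1
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(field, term, n) for (field, term), n in ranked[:max_terms]]


# A hypothetical category page made up of three products.
category_children = [
    {"name": "business cards", "brand": "Acme"},
    {"name": "premium business cards", "brand": "Acme"},
    {"name": "greeting cards", "brand": "Paperco"},
]
for field, term, n in top_terms_from_children(category_children, ["name", "brand"], 3):
    print(field, term, n)
```

The ranked pairs could then be turned into field-scoped query clauses in
the same way as for a single source document.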

I was wondering if you guys would be interested in an alternate
implementation of MLT that supports all the knobs that Solr search does.
I could post a patch file, maybe?

Thanks
Gagan
