lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Judioo <cont...@judioo.com>
Subject Boost Strangeness
Date Sat, 18 Jun 2011 12:11:27 GMT
WONDERFUL!
Just reporting back.
This document is ACE

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

For explaining what the filters are and how to affect the analyzer.

Erik your statement "First, boosting isn't absolute"  played on me so
I continued to investigate boosting.

I found this document that ( at last ) explains the dismax logic

http://www.lucidimagination.com/blog/2010/05/23/whats-a-dismax/

The reason why I was not getting the order I require was due to:
A)  my boost metrics were too close together.
b) similar id's in a document affected the score


It seems that if a partial match is made the product ( a % of the
total boost ) contributes to the documents score.
This meant that one type of document in the index had a higher
aggregate score due to the fact it had all but one of the boosted
fields ( does not have parent_id ) in it and the fields where
populated with content that was *very* similar to the requested id.

for example

required id = b011mg62
X_id = b011mgsf

Due to the partial matching and closeness of the boost ranges this
type of document always aquired a higher score than another document
with just one matching field ( i.e. id field ).

My solution was to increase the value of the fields I wanted to *really* count

id^100000 parent_id^5000 brand_container_id^500 ....

As a result even if there are similar matches in any field the id and
parent_id matches should always receive a higher boost.


This was also useful
http://stackoverflow.com/questions/2179497/adding-date-boosting-to-complex-solr-queries


Thanks for the help!

Mime
View raw message