lucene-java-user mailing list archives

From "Ravindra Sharma" <ravish....@gmail.com>
Subject Somewhat complex scoring/boosting
Date Fri, 05 Sep 2008 22:44:30 GMT
Hi Folks,

I have a somewhat complex scoring/boosting requirement.

Say I have three text fields A, B and C, and a numeric field called D.
Say my query is "testrank".

Scoring should be based on the following:

Query matches
1. text fields A, B and C, & Highest value of D (highest boost/rank)
2. A and B, & Highest value of D (2nd highest)
3. A and C, & Highest value of D (3rd highest)
4. B and C, & Highest value of D (4th highest)
5. B, & Highest value of D (5th highest)
6. C, & Highest value of D (6th highest)

i) If I use a standard boolean query with boosts, it will look something
like this:

query = (A:testrank AND B:testrank AND C:testrank)^10 OR (A:testrank AND
B:testrank)^9 OR (A:testrank AND C:testrank)^8 OR (B:testrank AND
C:testrank)^7 OR (A:testrank)^6 OR (B:testrank)^5 OR (C:testrank)^4
sort = by Score (primary), Field D (Secondary)

Also, I need to override Similarity so that tf, idf, etc. don't
interfere and all docs score purely on the boost values I have
specified. That way the secondary sort can be effective.

This will be a poor query so I would like to avoid it.
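
The two-level ordering described in (i) can be sketched in plain Java as
below. This is only an illustration and does not use the Lucene API: the
Hit class and the sample values are hypothetical, and it assumes tf/idf
have already been neutralized so that the score is the boost alone. It
shows why the scores must tie exactly before field D has any effect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ScoreThenD {
    // Hypothetical stand-in for a search hit; not a Lucene class.
    static final class Hit {
        final String id;
        final float score; // boost-only score (assumes tf/idf are neutralized)
        final long d;      // value of numeric field D

        Hit(String id, float score, long d) {
            this.id = id;
            this.score = score;
            this.d = d;
        }
    }

    // Sort by score descending, then by D descending among equal scores.
    static List<Hit> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed()
                .thenComparing(Comparator.comparingLong((Hit h) -> h.d).reversed()));
        return sorted;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("x", 10f, 1),    // matches A, B and C, low D
                new Hit("y", 10f, 5),    // matches A, B and C, higher D
                new Hit("z", 9f, 100));  // matches A and B only, very high D
        for (Hit h : rank(hits)) {
            System.out.println(h.id + " score=" + h.score + " D=" + h.d);
        }
    }
}
```

If tf or idf nudged the score of "x" even slightly above that of "y", the
secondary sort on D would never be consulted for that pair.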

ii) I have never used DisjunctionMaxQuery (or Solr qt=DisMax), and at first
glance it appeared to be just what I need (with tiebreaker = 0). However, it
is not. If I understand it correctly, #1, #2 and #3 will score equally
because it scores based only on the highest boost (which is A for #1, #2 and #3).

This will not work.
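
To make the tie in (ii) concrete, here is a minimal sketch of the
DisjunctionMaxQuery combination rule (the maximum sub-score plus the
tiebreaker times the sum of the remaining sub-scores) in plain Java. It
does not call Lucene; the per-field scores 3, 2 and 1 for A, B and C are
made-up constants, and tf/idf are assumed to be neutralized.

```java
final class DisMaxSketch {
    // max + tieBreaker * (sum of the other sub-scores)
    static float disMax(float tieBreaker, float... subScores) {
        float max = 0f;
        float sum = 0f;
        for (float s : subScores) {
            max = Math.max(max, s);
            sum += s;
        }
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        // Hypothetical constant scores for a match in A, B and C.
        float a = 3f, b = 2f, c = 1f;

        // tiebreaker = 0: cases #1, #2 and #3 all collapse to the score of A.
        System.out.println(disMax(0f, a, b, c)); // A, B and C match
        System.out.println(disMax(0f, a, b));    // A and B match
        System.out.println(disMax(0f, a, c));    // A and C match

        // tiebreaker = 1: the same rule reduces to a plain sum.
        System.out.println(disMax(1f, a, b, c));
        System.out.println(disMax(1f, a, b));
        System.out.println(disMax(1f, a, c));
    }
}
```

With the tiebreaker at 0 the three cases are indistinguishable; under this
rule a tiebreaker of 1 makes the combination a plain sum of the matching
sub-scores.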

iii) I am wondering whether I have to write a custom Query (custom scorer)
like DisjunctionMaxQuery that scores based on the sum of the matching fields
instead of just taking the highest, or whether I could override the scoring
of DisjunctionMaxQuery so that it takes the sum of the scores from its
sub-queries.
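
As a rough check of the sum-based idea in (iii), the sketch below adds up
one constant weight per matching field. The weights A=4, B=3, C=2 are an
assumption chosen for illustration (any weights with A > B > C and
B + C > A would do), and no Lucene classes are involved. With these
weights the sums fall in the same order as the boosts ^10 down to ^4 in
the query from (i).

```java
final class SumOfFieldBoosts {
    // Illustrative per-field weights; not taken from any real configuration.
    static final int BOOST_A = 4;
    static final int BOOST_B = 3;
    static final int BOOST_C = 2;

    // Score is the sum of the weights of the fields that matched.
    static int score(boolean a, boolean b, boolean c) {
        return (a ? BOOST_A : 0) + (b ? BOOST_B : 0) + (c ? BOOST_C : 0);
    }

    public static void main(String[] args) {
        System.out.println("A+B+C: " + score(true, true, true));
        System.out.println("A+B:   " + score(true, true, false));
        System.out.println("A+C:   " + score(true, false, true));
        System.out.println("B+C:   " + score(false, true, true));
        System.out.println("A:     " + score(true, false, false));
        System.out.println("B:     " + score(false, true, false));
        System.out.println("C:     " + score(false, false, true));
    }
}
```

Every combination gets a distinct sum, so documents matching the same set
of fields tie exactly and a secondary sort on D can then break the tie.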

If anyone has a clever suggestion, I would really appreciate it.

Thanks,
Ravi

P.S. - Posted the same question on solr-user as well.
