lucene-solr-user mailing list archives

From Brian Yee <b...@wayfair.com>
Subject LTR original score feature
Date Fri, 12 Jan 2018 19:52:51 GMT
I'd like some opinions on using the original score feature. Intuitively, the original score
produced by Solr is a very important feature. In my data set, though, the original score
varies wildly between queries. That makes sense, since the score Solr generates is not
normalized across queries. But won't this distort our training data? If this feature is
3269.4 for the top result of one query and 32.7 for the top result of another, that does not
mean the first document was roughly 100x more relevant to its query than the second was to
its own. I am using the normalize param within RankLib, but that only normalizes features
against each other, not within a single feature, right? How are people handling this?
Am I missing something?
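
To make the scale problem concrete, here is a minimal Python sketch of one possible way to handle it: rescaling the original score within each query before training. The helper name `normalize_per_query`, the choice of min-max scaling, and the sample rows (built around the two scores above) are assumptions for illustration only. This is not a description of what RankLib's normalize option does.

```python
from collections import defaultdict


def normalize_per_query(rows):
    """Min-max scale the original score within each query.

    rows: list of (query_id, doc_id, original_score) tuples.
    Returns a list of (query_id, doc_id, scaled_score) in the same order.
    Hypothetical helper, not part of Solr or RankLib.
    """
    # Collect all scores seen for each query.
    scores_by_query = defaultdict(list)
    for query_id, _doc_id, score in rows:
        scores_by_query[query_id].append(score)

    # Record the per-query minimum and maximum.
    bounds = {q: (min(s), max(s)) for q, s in scores_by_query.items()}

    scaled_rows = []
    for query_id, doc_id, score in rows:
        low, high = bounds[query_id]
        # If every document in the query has the same score, emit 0.0.
        scaled = 0.0 if high == low else (score - low) / (high - low)
        scaled_rows.append((query_id, doc_id, scaled))
    return scaled_rows


# Made-up rows: two queries whose raw scores differ by about 100x.
rows = [
    ("q1", "a", 3269.4), ("q1", "b", 1634.7), ("q1", "c", 0.0),
    ("q2", "x", 32.7), ("q2", "y", 16.35), ("q2", "z", 0.0),
]
normalized = normalize_per_query(rows)
for row in normalized:
    print(row)
```

After scaling, the top result of each query has the same feature value, so the learner sees the score relative to the other candidates for that query rather than its raw magnitude. Whether min-max, z-score, or no scaling works best would need to be checked against the actual training data.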
