lucene-java-commits mailing list archives

From Apache Wiki <>
Subject [Lucene-java Wiki] Update of "SummerOfCode2011ProjectRankingNotes" by DavidNemeskey
Date Wed, 06 Jul 2011 08:28:09 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-java Wiki" for change notification.

The "SummerOfCode2011ProjectRankingNotes" page has been changed by DavidNemeskey:

    * `score + boost`: I do not consider this a boost, but rather a sum of similarity scores, one of which happens to come from an external source (e.g. PageRank)
    * `score * boost`
    * `score = tf(boost * freq) * idf`
+  * Language modeling would require custom aggregation of query terms
+   * product of term scores instead of a weighted sum (taking logs would turn the product into a sum, but the query norm still interferes)
+   * decide which documents contain a term and which do not, because the two cases have to be weighted differently (p_t or 1 - p_t)
+   * two types of aggregation?
+    * per field (definitely Similarity-specific)
+    * whole query (should be Similarity-specific too, but might be OK if fixed)
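
The product-versus-sum point above can be illustrated with a minimal sketch. It is not Lucene code: the class `LmAggregationSketch`, its two methods, and the sample probabilities are hypothetical, and each `p_t` is assumed to be already known and strictly between 0 and 1. A query term present in the document contributes `p_t`, an absent one contributes `1 - p_t`, and the log form yields the same value as the product after exponentiation.

```java
// Hypothetical illustration only; none of these names come from Lucene.
public final class LmAggregationSketch {

    /** Product aggregation: p_t for terms the document contains, (1 - p_t) otherwise. */
    public static double productScore(double[] p, boolean[] present) {
        double score = 1.0;
        for (int i = 0; i < p.length; i++) {
            score *= present[i] ? p[i] : 1.0 - p[i];
        }
        return score;
    }

    /** Same aggregation in log space, so the per-term values are summed instead of multiplied. */
    public static double logSumScore(double[] p, boolean[] present) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            sum += Math.log(present[i] ? p[i] : 1.0 - p[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumed per-term probabilities and presence flags for one document.
        double[] p = {0.5, 0.25, 0.8};
        boolean[] present = {true, false, true};

        double product = productScore(p, present);
        double logSum = logSumScore(p, present);

        // The two values printed here should agree up to floating-point rounding.
        System.out.println("product      = " + product);
        System.out.println("exp(log sum) = " + Math.exp(logSum));
    }
}
```

Note that the sketch has to visit absent terms as well as matching ones, which is the reason the aggregation step needs to know which documents contain a term.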
  === Questions about Lucene ===
   * Is it possible to design a scoring interface that is consistent across ranking frameworks?
