spark-commits mailing list archives

From sro...@apache.org
Subject spark git commit: [DOCS] Fixed NDCG formula issues
Date Mon, 20 Aug 2018 19:59:25 GMT
Repository: spark
Updated Branches:
  refs/heads/branch-2.3 ea01e362f -> 9702bb637


[DOCS] Fixed NDCG formula issues

When j is 0, ln(j+1) is 0, which causes a division by zero in the NDCG and IDCG formulas. The discount is changed to ln(j+2).

Closes #22090 from yueguoguo/patch-1.

Authored-by: Zhang Le <yueguoguo@users.noreply.github.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
(cherry picked from commit 219ed7b487c2dfb5007247f77ebf1b3cc73cecb5)
Signed-off-by: Sean Owen <sean.owen@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9702bb63
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9702bb63
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9702bb63

Branch: refs/heads/branch-2.3
Commit: 9702bb637d5ac665fefaa96cc69c5f92553f613a
Parents: ea01e36
Author: Zhang Le <yueguoguo@users.noreply.github.com>
Authored: Mon Aug 20 14:59:03 2018 -0500
Committer: Sean Owen <sean.owen@databricks.com>
Committed: Mon Aug 20 14:59:21 2018 -0500

----------------------------------------------------------------------
 docs/mllib-evaluation-metrics.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/9702bb63/docs/mllib-evaluation-metrics.md
----------------------------------------------------------------------
diff --git a/docs/mllib-evaluation-metrics.md b/docs/mllib-evaluation-metrics.md
index 7f27754..ac398fb 100644
--- a/docs/mllib-evaluation-metrics.md
+++ b/docs/mllib-evaluation-metrics.md
@@ -462,13 +462,13 @@ $$rel_D(r) = \begin{cases}1 & \text{if $r \in D$}, \\ 0 & \text{otherwise}.\end{
       <td>Normalized Discounted Cumulative Gain</td>
       <td>
         $NDCG(k)=\frac{1}{M} \sum_{i=0}^{M-1} {\frac{1}{IDCG(D_i, k)}\sum_{j=0}^{n-1}
-          \frac{rel_{D_i}(R_i(j))}{\text{ln}(j+1)}} \\
+          \frac{rel_{D_i}(R_i(j))}{\text{ln}(j+2)}} \\
         \text{Where} \\
         \hspace{5 mm} n = \text{min}\left(\text{max}\left(|R_i|,|D_i|\right),k\right) \\
-        \hspace{5 mm} IDCG(D, k) = \sum_{j=0}^{\text{min}(\left|D\right|, k) - 1} \frac{1}{\text{ln}(j+1)}$
+        \hspace{5 mm} IDCG(D, k) = \sum_{j=0}^{\text{min}(\left|D\right|, k) - 1} \frac{1}{\text{ln}(j+2)}$
       </td>
       <td>
-        <a href="https://en.wikipedia.org/wiki/Information_retrieval#Discounted_cumulative_gain">NDCG at k</a> is a
+        <a href="https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG">NDCG at k</a> is a
         measure of how many of the first k recommended documents are in the set of true relevant documents averaged
         across all users. In contrast to precision at k, this metric takes into account the order of the recommendations
         (documents are assumed to be in order of decreasing relevance).
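
The corrected formula can be checked numerically with a short sketch. This is an illustration written for this message, not Spark's implementation: the function names `ndcg_at_k` and `mean_ndcg_at_k` are hypothetical, relevance is binary as in the formula above, and positions beyond the end of the recommendation list are assumed to contribute zero gain.

```python
import math


def ndcg_at_k(predicted, relevant, k):
    """NDCG at k for one user, using the ln(j+2) discount with 0-based j.

    predicted: ranked list of recommended document ids (R_i)
    relevant:  collection of truly relevant document ids (D_i)
    """
    relevant_set = set(relevant)
    if not relevant_set or k <= 0:
        # Assumption: with no ground truth the metric is defined as 0.
        return 0.0

    # n = min(max(|R_i|, |D_i|), k)
    n = min(max(len(predicted), len(relevant_set)), k)

    # DCG: positions past the end of `predicted` have zero relevance.
    dcg = 0.0
    for j in range(min(n, len(predicted))):
        if predicted[j] in relevant_set:
            dcg += 1.0 / math.log(j + 2)

    # IDCG(D, k): every one of the first min(|D|, k) positions is relevant.
    idcg = sum(1.0 / math.log(j + 2) for j in range(min(len(relevant_set), k)))
    return dcg / idcg


def mean_ndcg_at_k(pairs, k):
    """Average NDCG at k over (predicted, relevant) pairs, one per user."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(ndcg_at_k(p, r, k) for p, r in pairs) / len(pairs)


def old_discount_term(j):
    """One term of the pre-fix formula, 1 / ln(j+1); fails at j = 0."""
    return 1.0 / math.log(j + 1)
```

Calling `old_discount_term(0)` raises `ZeroDivisionError` because `math.log(1)` is `0.0`, which is the problem this commit fixes. With the `j + 2` discount, the first position has weight `1 / ln(2)` and every term is finite.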

