Github user njayaram2 commented on a diff in the pull request:
https://github.com/apache/madlib/pull/225#discussion_r163653896
--- Diff: src/ports/postgres/modules/knn/knn.sql_in ---
@@ -326,6 +331,39 @@ Result, with neighbors sorted from closest to furthest:
(6 rows)
</pre>
+
+-# Run KNN for classification using the
+weighted average:
+<pre class="example">
+DROP TABLE IF EXISTS knn_result_classification;
+SELECT * FROM madlib.knn(
+ 'knn_train_data', -- Table of training data
+ 'data', -- Col name of training data
+ 'id', -- Col name of id in train data
+ 'label', -- Training labels
+ 'knn_test_data', -- Table of test data
+ 'data', -- Col name of test data
+ 'id', -- Col name of id in test data
+ 'knn_result_classification', -- Output table
+ 3, -- Number of nearest neighbors
+ True, -- True to list nearest-neighbors by id
+ 'madlib.squared_dist_norm2', -- Distance function
+ True -- For weighted average
+ );
+SELECT * FROM knn_result_classification ORDER BY id;
+</pre>
+<pre class="result">
+ id | data | prediction | k_nearest_neighbours
+----+---------+---------------------+----------------------
+ 1 | {2,1} | 2.2 | {2,1,3}
+ 2 | {2,6} | 0.425 | {5,4,3}
+ 3 | {15,40} | 0.0174339622641509 | {7,6,5}
+ 4 | {12,1} | 0.0379633360193392 | {4,5,3}
+ 5 | {2,90} | 0.00306428140577315 | {9,6,7}
+ 6 | {50,45} | 0.00214165229166379 | {6,7,8}
--- End diff --
Following up on the comment above about `pred_out` for classification with
weighted averaging: the predictions in this example should not be fractional values
like these. For classification, each prediction should be a class label, either `1` or `0`.
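
To illustrate the kind of output I would expect, here is a minimal sketch of distance-weighted voting in Python. It is not MADlib's implementation: the function name `weighted_vote`, the inverse-distance weighting, and the sample neighbor lists are all assumptions made for illustration. The point is only that the weights decide which label wins, and the prediction itself stays a class label.

```python
from collections import defaultdict


def weighted_vote(neighbors):
    """Pick a class label from (label, distance) pairs of the k nearest neighbors.

    Hypothetical helper: each neighbor votes for its label with weight
    1 / distance, and the label with the largest total weight wins.
    """
    votes = defaultdict(float)
    for label, dist in neighbors:
        # Small epsilon guards against division by zero for exact matches.
        votes[label] += 1.0 / (dist + 1e-9)
    # The result is one of the input labels, never a fractional score.
    return max(votes, key=votes.get)


# Made-up neighbors: two far points labeled 1, one close point labeled 0.
print(weighted_vote([(0, 0.5), (1, 2.0), (1, 4.0)]))
# Made-up neighbors: two close points labeled 1, one far point labeled 0.
print(weighted_vote([(1, 1.0), (1, 2.0), (0, 5.0)]))
```

In the first call the single close neighbor outweighs the two distant ones, so the weighted result differs from a plain majority vote, but it is still a label (`0`), not a weighted mean of the labels.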
---