madlib-dev mailing list archives

From njayaram2 <...@git.apache.org>
Subject [GitHub] madlib pull request #225: Added option for weighted average for both classif...
Date Wed, 24 Jan 2018 19:48:31 GMT
Github user njayaram2 commented on a diff in the pull request:

    https://github.com/apache/madlib/pull/225#discussion_r163653896
  
    --- Diff: src/ports/postgres/modules/knn/knn.sql_in ---
    @@ -326,6 +331,39 @@ Result, with neighbors sorted from closest to furthest:
     (6 rows)
     </pre>
     
    +
    +-#   Run KNN for classification using the 
    +weighted average:
    +<pre class="example">
    +DROP TABLE IF EXISTS knn_result_classification;
    +SELECT * FROM madlib.knn(
    +                'knn_train_data',      -- Table of training data
    +                'data',                -- Col name of training data
    +                'id',                  -- Col name of id in train data
    +                'label',               -- Training labels
    +                'knn_test_data',       -- Table of test data
    +                'data',                -- Col name of test data
    +                'id',                  -- Col name of id in test data
    +                'knn_result_classification',  -- Output table
    +                 3,                    -- Number of nearest neighbors
    +                 True,                 -- True to list nearest-neighbors by id
    +                 'madlib.squared_dist_norm2', -- Distance function
    +                 True                 -- For weighted average
    +                );
    +SELECT * FROM knn_result_classification ORDER BY id;
    +</pre>
    +<pre class="result">
    + id |  data   |     prediction      | k_nearest_neighbours 
    +----+---------+---------------------+----------------------
    +  1 | {2,1}   |                 2.2 | {2,1,3}
    +  2 | {2,6}   |               0.425 | {5,4,3}
    +  3 | {15,40} |  0.0174339622641509 | {7,6,5}
    +  4 | {12,1}  |  0.0379633360193392 | {4,5,3}
    +  5 | {2,90}  | 0.00306428140577315 | {9,6,7}
    +  6 | {50,45} | 0.00214165229166379 | {6,7,8}
    --- End diff --
    
    Continuing the comment above about `pred_out` for classification with weighted
    averaging: the `prediction` column in this example should not contain the fractional
    values shown. For classification, each prediction should be a class label, which for
    this data is either `1` or `0`.
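
To illustrate the point, here is a minimal Python sketch of distance-weighted voting for classification. It is not the MADlib implementation: the function name `weighted_vote`, the inverse-distance weight `1 / distance`, and the sample neighbors are all assumptions made for illustration. What it shows is that the weights are only used to pick a winner, and the value returned is the winning label rather than the accumulated weight.

```python
from collections import defaultdict


def weighted_vote(neighbors):
    """Pick a class label from (label, distance) pairs of the k nearest points.

    Hypothetical helper: each neighbor votes for its label with weight
    1 / distance (an assumed weighting scheme).
    """
    totals = defaultdict(float)
    for label, dist in neighbors:
        # An exact match (distance 0) gets infinite weight and dominates the vote.
        weight = float("inf") if dist == 0 else 1.0 / dist
        totals[label] += weight
    # Return the label with the largest total weight, not the weight itself.
    return max(totals, key=totals.get)


# Made-up neighbors: two points of class 1 nearby, one point of class 0 further away.
print(weighted_vote([(1, 1.0), (1, 2.0), (0, 5.0)]))  # class 1 wins, 1.5 vs 0.2
```

With this shape, the output for a classification run would always be one of the training labels, which is what the `prediction` column in the example above should contain.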

