flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1745) Add exact k-nearest-neighbours algorithm to machine learning library
Date Tue, 08 Mar 2016 03:11:40 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184308#comment-15184308 ]

ASF GitHub Bot commented on FLINK-1745:

Github user chiwanpark commented on the pull request:

    Hi @danielblazevski, thanks for the update, and sorry for the late reply. I tested your
implementation and found a few things to address before merging.
    First, the test cases. I think we should add a new test case for KNN with a quad-tree
rather than modifying the existing test case without one. We also need test cases for invalid
configurations, such as KNN with a quad-tree and an incompatible distance metric. The ScalaTest
documentation describes how to write a test that expects an exception (see the **Intercepted
exceptions** section of http://www.scalatest.org/user_guide/using_assertions).
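
A minimal sketch of such a test, assuming ScalaTest's 2.x-style `FlatSpec` with `Matchers` (newer ScalaTest releases name the class `AnyFlatSpec`). `KnnConfig`, `Metric`, `Euclidean`, `Cosine`, and `validate` are hypothetical stand-ins used only for illustration; they are not the classes in this pull request, and which metrics are actually incompatible with the quad-tree is an assumption here.

```scala
import org.scalatest.{FlatSpec, Matchers}

// Hypothetical stand-ins for illustration only; not Flink's actual API.
sealed trait Metric
case object Euclidean extends Metric
case object Cosine extends Metric // assumed to be incompatible with a quad-tree

final case class KnnConfig(k: Int, useQuadTree: Boolean, metric: Metric)

object KnnConfig {
  // Returns the configuration unchanged if it is valid;
  // `require` throws IllegalArgumentException otherwise.
  def validate(c: KnnConfig): KnnConfig = {
    require(c.k > 0, "k must be positive")
    require(!c.useQuadTree || c.metric == Euclidean,
      "quad-tree requires a Euclidean metric")
    c
  }
}

class KnnConfigSuite extends FlatSpec with Matchers {
  behavior of "KnnConfig.validate"

  it should "accept a quad-tree with a Euclidean metric" in {
    val c = KnnConfig(3, useQuadTree = true, Euclidean)
    KnnConfig.validate(c) should be (c)
  }

  it should "reject a quad-tree with an incompatible metric" in {
    // `intercept` fails the test unless the block throws the given exception type,
    // and returns the caught exception for further checks.
    val e = intercept[IllegalArgumentException] {
      KnnConfig.validate(KnnConfig(3, useQuadTree = true, Cosine))
    }
    e.getMessage should include ("quad-tree")
  }
}
```

The same pattern should carry over to the real estimator: build the invalid configuration inside the `intercept` block and check the type (and, optionally, the message) of the exception it raises.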
    Second, the package definitions of `QuadTree` and `QuadTreeSuite` do not match the directory
structure.
    Finally, I think the documentation needs a more detailed description, with some mathematical
background on KNN and quad-trees (including links to your slides and the papers you referred
to). We also need examples and a description of each parameter with its default value.
    About rebasing: if you set `apache/flink` as the remote `apache`, you can use the commands I
suggested with `upstream` renamed to `apache`. You don't need to worry about the rebase; I
also have a copy of your `FLINK-1745` branch on my local machine. If you run into problems
while rebasing, I'll rebase it onto `apache/master` myself.

> Add exact k-nearest-neighbours algorithm to machine learning library
> --------------------------------------------------------------------
>                 Key: FLINK-1745
>                 URL: https://issues.apache.org/jira/browse/FLINK-1745
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>            Assignee: Daniel Blazevski
>              Labels: ML, Starter
> Even though the k-nearest-neighbours (kNN) [1,2] algorithm is quite simple, it is still
used as a means to classify data and to do regression. This issue focuses on the implementation
of an exact kNN (H-BNLJ, H-BRJ) algorithm as proposed in [2].
> Could be a starter task.
> Resources:
> [1] [http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm]
> [2] [https://www.cs.utah.edu/~lifeifei/papers/mrknnj.pdf]

