flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1737) Add statistical whitening transformation to machine learning library
Date Fri, 11 Sep 2015 09:00:51 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14740439#comment-14740439 ]

ASF GitHub Bot commented on FLINK-1737:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1078#discussion_r39253000
  
    --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/math/SparseVector.scala ---
    @@ -85,6 +85,34 @@ case class SparseVector(
         }
       }
     
    +  /** Returns the outer product (a.k.a. Kronecker product) of `this`
    +    * with `other`. The result is given in [[org.apache.flink.ml.math.SparseMatrix]]
    +    * representation.
    +    *
    +    * @param other a Vector
    +    * @return the [[org.apache.flink.ml.math.SparseMatrix]] which equals the outer product of `this`
    +    *         with `other.`
    +    */
    +  override def outer(other: Vector): SparseMatrix = {
    +    val numRows = size
    +    val numCols = other.size
    +
    +    val otherIndices = other match {
    +      case sv @ SparseVector(_, _, _) => sv.indices
    +      case dv @ DenseVector(_) => (0 until dv.size).toArray
    +    }
    +
    +    val entries = for {
    +      i <- indices
    +      j <- otherIndices
    +      value = this(i) * other(j)
    --- End diff --
    
    It might make sense to operate directly on the `SparseVector`'s data array, because every
    `apply` call entails a binary search and thus has a complexity of `O(log n)`. The same
    holds true for the `other` vector if it is a `SparseVector`.
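
    As an illustration of the suggestion above (a minimal sketch, not taken from the pull
    request): assuming both operands are sparse and keep parallel `indices` and `data` arrays,
    the non-zero entries of the outer product can be read positionally, so no per-element
    lookup is needed. `SimpleSparseVector` and `outerEntries` are hypothetical names used only
    for this example and are not part of the Flink ML API.

```scala
// Hypothetical, simplified stand-in for a sparse vector; NOT the Flink ML class.
// `indices(p)` is the position of the p-th stored value `data(p)`.
final case class SimpleSparseVector(size: Int, indices: Array[Int], data: Array[Double])

object OuterSketch {
  /** Returns the (row, column, value) triples of the outer product of `a` and `b`.
    * Stored values are read by array position, so each entry costs O(1)
    * instead of the O(log n) binary search behind an index-based lookup.
    */
  def outerEntries(
      a: SimpleSparseVector,
      b: SimpleSparseVector): Array[(Int, Int, Double)] = {
    val result = new Array[(Int, Int, Double)](a.indices.length * b.indices.length)
    var k = 0
    var p = 0
    while (p < a.indices.length) {
      var q = 0
      while (q < b.indices.length) {
        // Row/column come from the index arrays, the value from the data arrays.
        result(k) = (a.indices(p), b.indices(q), a.data(p) * b.data(q))
        k += 1
        q += 1
      }
      p += 1
    }
    result
  }
}
```

    For a dense `other` vector, element access is already constant time, so only the sparse
    operand would benefit from this change.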


> Add statistical whitening transformation to machine learning library
> --------------------------------------------------------------------
>
>                 Key: FLINK-1737
>                 URL: https://issues.apache.org/jira/browse/FLINK-1737
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>            Assignee: Daniel Pape
>              Labels: ML, Starter
>
> The statistical whitening transformation [1] is a preprocessing step for different ML algorithms. It decorrelates the individual dimensions and sets their variance to 1.
> Statistical whitening should be implemented as a {{Transformer}}.
> Resources:
> [1] [http://en.wikipedia.org/wiki/Whitening_transformation]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
