mahout-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAHOUT-1746) Fix: mxA ^ 2, mxA ^ 0.5 to mean the same thing as mxA * mxA and mxA ::= sqrt _
Date Wed, 24 Jun 2015 22:17:06 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600264#comment-14600264 ]

ASF GitHub Bot commented on MAHOUT-1746:
----------------------------------------

Github user dlyubimov commented on a diff in the pull request:

    https://github.com/apache/mahout/pull/145#discussion_r33203413
  
    --- Diff: math-scala/src/main/scala/org/apache/mahout/math/drm/package.scala ---
    @@ -160,6 +160,165 @@ package object drm {
       def dsqrt[K: ClassTag](drmA: DrmLike[K]): DrmLike[K] = new OpAewUnaryFunc[K](drmA, math.sqrt)
     
       def dsignum[K: ClassTag](drmA: DrmLike[K]): DrmLike[K] = new OpAewUnaryFunc[K](drmA, math.signum)
    +  
    +  ///////////////////////////////////////////////////////////
    +  // Misc. math utilities.
    +
    +  /**
    +   * Compute column wise means and variances -- distributed version.
    +   *
    +   * @param drmA Note: will pin input to cache if not yet pinned.
    +   * @tparam K
    +   * @return colMeans → colVariances
    +   */
    +  def dcolMeanVars[K: ClassTag](drmA: DrmLike[K]): (Vector, Vector) = {
    +
    +    import RLikeDrmOps._
    +
    +    val drmAcp = drmA.checkpoint()
    +    
    +    val mu = drmAcp colMeans
    +
    +    // Compute variance using mean(x^2) - mean(x)^2
    +    val variances = (drmAcp ^ 2 colMeans) -=: mu * mu
    +
    +    mu → variances
    --- End diff --
    
    Other than that it is valid Scala style? No.
    
    By the way, every tool I have shows it correctly for me, including less, web pages, LaTeX/LyX, etc. Use IntelliJ. Really.
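
    The diff above computes column variances as mean(x^2) - mean(x)^2. A minimal plain-Scala sketch of that same formula follows, assuming an in-memory array of rows rather than a Mahout DRM; `ColStats` and `colMeanVars` are hypothetical names used only for illustration and are not part of Mahout's API.

```scala
object ColStats {
  // Column-wise means and (population) variances of an in-memory matrix,
  // using variance = mean(x^2) - mean(x)^2, as in the diff's comment.
  // Hypothetical helper for illustration; not Mahout code.
  def colMeanVars(rows: Array[Array[Double]]): (Array[Double], Array[Double]) = {
    require(rows.nonEmpty, "need at least one row")
    val n = rows.length
    val d = rows.head.length
    val sum = new Array[Double](d)    // running sum of x per column
    val sumSq = new Array[Double](d)  // running sum of x * x per column
    for (row <- rows; j <- 0 until d) {
      sum(j) += row(j)
      sumSq(j) += row(j) * row(j)     // x * x, not math.pow(x, 2.0)
    }
    val mu = sum.map(_ / n)
    val variances = Array.tabulate(d)(j => sumSq(j) / n - mu(j) * mu(j))
    (mu, variances)
  }
}
```

    Note that this one-pass formula subtracts two nearly equal quantities when the mean is large relative to the spread, so it can lose precision. That is one reason consistent rounding between x ^ 2 and x * x matters here.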



> Fix: mxA ^ 2, mxA ^ 0.5 to mean the same thing as mxA * mxA and mxA ::= sqrt _
> ------------------------------------------------------------------------------
>
>                 Key: MAHOUT-1746
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1746
>             Project: Mahout
>          Issue Type: Blog - New Blog Request
>            Reporter: Dmitriy Lyubimov
>            Assignee: Dmitriy Lyubimov
>             Fix For: 0.10.2
>
>
> It so happens that in Java, if x is of type double, Math.pow(x, 2.0) and x * x produce different values approximately once in a million random values.
> This is extremely annoying because it creates rounding errors, especially in things like Euclidean distance computations, which may eventually produce occasional NaNs.
> This issue proposes special treatment in the vector and matrix DSL to make sure identical FPU algorithms run in both forms, as follows:
> x ^ 2 <=> x * x
> x ^ 0.5 <=> sqrt(x)
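
The proposed equivalences can be sketched in plain Scala as a small dispatch on the exponent. This is only an illustration under stated assumptions: `PowDispatch`, `powSpecial`, and `countMismatches` are hypothetical names, not Mahout's implementation, and the number of mismatches reported by the check depends on the JVM (some JVM versions appear to special-case an exponent of 2, in which case the count may be zero).

```scala
object PowDispatch {
  // Hypothetical helper: route common exponents to the same floating-point
  // operations that hand-written code would use, and fall back to math.pow
  // for everything else.
  def powSpecial(x: Double, p: Double): Double = p match {
    case 2.0 => x * x
    case 0.5 => math.sqrt(x)
    case 1.0 => x
    case _   => math.pow(x, p)
  }

  // Count how often math.pow(x, 2.0) and x * x differ for n random doubles
  // drawn from [0, 1). The result is JVM-dependent.
  def countMismatches(n: Int, seed: Long): Int = {
    val rnd = new scala.util.Random(seed)
    var mismatches = 0
    var i = 0
    while (i < n) {
      val x = rnd.nextDouble()
      if (math.pow(x, 2.0) != x * x) mismatches += 1
      i += 1
    }
    mismatches
  }
}
```

Calling `PowDispatch.countMismatches(1000000, 42L)` gives a rough check of the "once in a million" observation on a particular JVM, while `powSpecial(x, 2.0)` agrees with `x * x` by construction.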



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
