mahout-user mailing list archives

From Jason Smith <jasonparal...@gmail.com>
Subject PearsonCorrelationSimilarity returning NaN for user similarity with perfect match
Date Thu, 02 Jun 2011 03:00:53 GMT
What is the reasoning behind PearsonCorrelationSimilarity returning
NaN for userSimilarity when the two users' overlapping reviews match
up perfectly?
In my case of a limited set of rating values (1 to 5 stars), it seems
quite possible that a user with a small number of ratings might have
overlapping ratings identical to those of other users. Am I missing
something here?

    // Note that sum of X and sum of Y don't appear here since they are
    // assumed to be 0; the data is assumed to be centered.
    double denominator = Math.sqrt(sumX2) * Math.sqrt(sumY2);
    if (denominator == 0.0) {
      // One or both parties has -all- the same ratings;
      // can't really say much similarity under this measure
      return Double.NaN;
    }
    return sumXY / denominator;
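
A minimal standalone sketch of the case in question. It is not Mahout code:
the class PearsonSketch and the method pearson are hypothetical names, and
the sketch assumes each user's ratings are centered on that user's mean over
the overlapping items only. Under that assumption, a perfect match on
identical, constant ratings leaves every centered value at 0, so the
denominator is 0 and the result is NaN, while a perfect match on varying
ratings gives a value close to 1.0.

```java
public class PearsonSketch {

    // Hypothetical helper (not Mahout's API): Pearson correlation over the
    // overlapping ratings of two users, centering each user on their own mean.
    public static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sumXY = 0.0;
        double sumX2 = 0.0;
        double sumY2 = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX; // centered rating of user X
            double dy = y[i] - meanY; // centered rating of user Y
            sumXY += dx * dy;
            sumX2 += dx * dx;
            sumY2 += dy * dy;
        }

        // Same guard as in the quoted snippet: zero variance on either side
        // makes the correlation undefined.
        double denominator = Math.sqrt(sumX2) * Math.sqrt(sumY2);
        if (denominator == 0.0) {
            return Double.NaN;
        }
        return sumXY / denominator;
    }

    public static void main(String[] args) {
        // Perfect match, but every overlapping rating is 5: prints NaN.
        System.out.println(pearson(new double[] {5, 5, 5}, new double[] {5, 5, 5}));
        // Perfect match with varying ratings: prints a value close to 1.0.
        System.out.println(pearson(new double[] {1, 3, 5}, new double[] {1, 3, 5}));
    }
}
```

The same thing happens when the overlap is a single item, since one rating
always equals its own mean.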
