hadoop-common-issues mailing list archives

From "Erik Krogen (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-13804) MutableStat mean loses accuracy if add(long, long) is used
Date Mon, 07 Nov 2016 22:49:59 GMT
Erik Krogen created HADOOP-13804:
------------------------------------

             Summary: MutableStat mean loses accuracy if add(long, long) is used
                 Key: HADOOP-13804
                 URL: https://issues.apache.org/jira/browse/HADOOP-13804
             Project: Hadoop Common
          Issue Type: Bug
          Components: metrics
    Affects Versions: 2.6.5
            Reporter: Erik Krogen
            Assignee: Erik Krogen
            Priority: Minor


Currently, if the {{MutableStat.add(long numSamples, long sum)}} method is used with a large
sample count, the mean that is returned will be very inaccurate. The cause is that the mean
is computed with the same Welford method that is used for the variance calculation, and that
method assumes each sample is processed on its own. For variance this is fine, since variance
numbers lose meaning if you add many samples at once, but the mean should still be accurate.
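
To illustrate the effect described above, here is a minimal, self-contained sketch. It is not the Hadoop implementation: the class names {{WelfordStyleMean}} and {{SumCountMean}} and the method {{addBatch}} are hypothetical, and the update rule is only an assumption about how a Welford-style running mean behaves when an aggregated sum is fed in as if it were a single sample. The second class shows one possible alternative, tracking the total sum and the sample count separately.

```java
public class BatchMeanSketch {

    /** Hypothetical Welford-style running mean that assumes one sample per update. */
    static final class WelfordStyleMean {
        long count;
        double mean;

        /** Treats the aggregated sum as if it were a single observation. */
        void addBatch(long numSamples, double sum) {
            count += numSamples;
            // Welford update: mean += (x - mean) / n, with x = sum and n = total count.
            mean = mean + (sum - mean) / count;
        }
    }

    /** Hypothetical alternative: keep the running total and count, divide on demand. */
    static final class SumCountMean {
        long count;
        double total;

        void addBatch(long numSamples, double sum) {
            count += numSamples;
            total += sum;
        }

        double mean() {
            return count == 0 ? 0.0 : total / count;
        }
    }

    public static void main(String[] args) {
        WelfordStyleMean welford = new WelfordStyleMean();
        SumCountMean sumCount = new SumCountMean();

        // Two batches of 1000 samples, each batch summing to 5000 (true mean is 5.0).
        for (int i = 0; i < 2; i++) {
            welford.addBatch(1000, 5000.0);
            sumCount.addBatch(1000, 5000.0);
        }

        // The Welford-style value drifts to roughly 7.5 after the second batch.
        System.out.println("Welford-style mean: " + welford.mean);
        // The sum/count value stays at the true mean.
        System.out.println("Sum/count mean:     " + sumCount.mean());
    }
}
```

In this sketch the first batch happens to give the right answer because the running mean starts at zero. The error appears from the second batch onward, since the batch sum is then compared against the running mean as if it were one sample.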



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

