hadoop-common-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13804) MutableStat mean loses accuracy if add(long, long) is used
Date Tue, 08 Nov 2016 00:40:58 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15645977#comment-15645977 ]

Hudson commented on HADOOP-13804:
---------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10785 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10785/])
HADOOP-13804. MutableStat mean loses accuracy if add(long, long) is (zhz: rev 3dbad5d823b8bf61b643dd1057165044138b99e0)
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/util/SampleStat.java
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/lib/MutableStat.java
* (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/metrics2/lib/TestMutableMetrics.java


> MutableStat mean loses accuracy if add(long, long) is used
> ----------------------------------------------------------
>
>                 Key: HADOOP-13804
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13804
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 2.6.5
>            Reporter: Erik Krogen
>            Assignee: Erik Krogen
>            Priority: Minor
>             Fix For: 2.8.0, 2.7.4, 3.0.0-alpha2
>
>         Attachments: HADOOP-13804.000.patch
>
>
> Currently, if the {{MutableStat.add(long numSamples, long sum)}} method is used with a
> large sample count, the mean that is returned will be very inaccurate. This happens because
> the mean is calculated with the same Welford method that is used for the variance calculation,
> and that method assumes that each sample is processed on its own. For variance this is fine,
> since variance numbers lose meaning if you add many samples at once, but the mean should
> still be accurate.
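
To make the reported effect concrete, here is a minimal sketch of how a Welford-style running mean can drift when a whole batch is folded in as if it were one sample, compared with a mean derived from a running total. The class BatchMeanSketch and its methods are hypothetical names made up for illustration; this is not the code in SampleStat or MutableStat, and it is not meant to represent the committed patch.

```java
// Hypothetical illustration only; not the Hadoop SampleStat/MutableStat source.
class BatchMeanSketch {
    private long numSamples = 0;
    private double welfordMean = 0.0; // running mean, updated Welford-style
    private double total = 0.0;       // plain running sum of all values

    /** Adds a batch of nSamples values whose values add up to sum. */
    void add(long nSamples, double sum) {
        numSamples += nSamples;
        total += sum;
        // Welford-style mean update, applied as if the whole batch were a
        // single sample with value `sum`. The update is only exact when
        // samples arrive one at a time (nSamples == 1).
        welfordMean = welfordMean + (sum - welfordMean) / numSamples;
    }

    /** Mean as produced by the per-sample update misapplied to batches. */
    double welfordMean() {
        return welfordMean;
    }

    /** Mean computed directly from the running total and sample count. */
    double totalBasedMean() {
        return numSamples > 0 ? total / numSamples : 0.0;
    }

    public static void main(String[] args) {
        BatchMeanSketch s = new BatchMeanSketch();
        s.add(1000, 2000.0); // 1000 samples averaging 2.0
        s.add(1000, 4000.0); // 1000 samples averaging 4.0
        // The true mean of all 2000 samples is 6000 / 2000 = 3.0.
        System.out.println("Welford-style mean: " + s.welfordMean());
        System.out.println("Total-based mean:   " + s.totalBasedMean());
    }
}
```

With these two batches the Welford-style value ends up near 3.999, while dividing the running total by the sample count gives 3.0. Keeping a separate running total is one straightforward way to keep the mean accurate for batched adds, independent of how the variance is tracked.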



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

