hadoop-common-issues mailing list archives

From "Erik Krogen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13804) MutableStat mean loses accuracy if add(long, long) is used
Date Mon, 07 Nov 2016 23:07:58 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen updated HADOOP-13804:
    Attachment: HADOOP-13804.000.patch

> MutableStat mean loses accuracy if add(long, long) is used
> ----------------------------------------------------------
>                 Key: HADOOP-13804
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13804
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 2.6.5
>            Reporter: Erik Krogen
>            Assignee: Erik Krogen
>            Priority: Minor
>         Attachments: HADOOP-13804.000.patch
> Currently, if the {{MutableStat.add(long numSamples, long sum)}} method is called with a
> large sample count, the returned mean is very inaccurate. The mean is computed with the same
> Welford method used for the variance calculation, which assumes that samples are processed
> one at a time. For variance this is acceptable, since variance numbers lose meaning when many
> samples are added at once, but the mean should still be accurate.
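
To illustrate the effect described above, here is a minimal sketch. It is not the Hadoop source and not the attached patch: the class {{BatchMeanSketch}} and its methods are hypothetical names, and it assumes that the batch sum is fed into a Welford-style mean update as though it were a single sample value. A mean derived from a running total divided by the sample count is shown alongside for comparison.

```java
/**
 * Hypothetical illustration only; this is NOT the MutableStat/SampleStat code.
 * Assumption: a batch (numSamples, sum) is pushed through a Welford-style
 * mean update as if "sum" were the value of one sample.
 */
final class BatchMeanSketch {
    private long count;          // total number of samples seen so far
    private double welfordMean;  // mean maintained by the incremental update
    private double total;        // plain running total of all sample values

    /** Records a batch of numSamples samples whose values add up to sum. */
    void add(long numSamples, long sum) {
        count += numSamples;
        // Welford-style step: correct when numSamples == 1 and sum is that
        // single sample, but skewed when sum aggregates many samples.
        welfordMean += (sum - welfordMean) / count;
        // Tracking the total separately keeps the mean exact for batches.
        total += sum;
    }

    /** Mean as produced by the incremental (Welford-style) update. */
    double welfordMean() {
        return welfordMean;
    }

    /** Mean computed as running total divided by sample count. */
    double totalBasedMean() {
        return count == 0 ? 0.0 : total / count;
    }

    public static void main(String[] args) {
        BatchMeanSketch stat = new BatchMeanSketch();
        // Two batches of 1000 samples each, every sample having value 5.
        stat.add(1000, 5000);
        stat.add(1000, 5000);
        System.out.println("Welford-style mean: " + stat.welfordMean());
        System.out.println("total / count mean: " + stat.totalBasedMean());
    }
}
```

In this sketch the first batch happens to give the right answer, but after the second batch the Welford-style mean drifts to roughly 7.5 while the total-based mean stays at 5.0, the true average of the samples.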

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
