commons-user mailing list archives

From Mikkel Meyer Andersen <m...@mikl.dk>
Subject Re: [math] Anyone using BetaDistribution and BetaDistributionImpl
Date Thu, 17 Mar 2011 14:36:09 GMT
2011/3/17 KARR, DAVID (ATTSI) <dk068x@att.com>:
>> -----Original Message-----
>> From: Mikkel Meyer Andersen [mailto:mikl@mikl.dk]
>> Sent: Thursday, March 17, 2011 3:22 AM
>> To: Commons Users List
>> Subject: Re: [math] Anyone using BetaDistribution and
>> BetaDistributionImpl
>>
>> Hi David,
>>
>> Yes, I am using the implementation of the beta distribution and am
>> quite
>> happy with it. Anything in particular you're thinking of that I can
>> help you
>> with?
>
> Well, primarily you could help me figure out how to use it :) , but
> that's more of an issue with not understanding the statistics, as
> opposed to not understanding the API.
I would be happy to help you with using the API. As for the
statistics, I can only offer a limited amount of help since this is
for commercial use. For thorough statistical consulting, I can
provide assistance through my company (in that case, contact me
directly).
>
> We have a large collection of individual data records that indicate
> success/failure of an operation call, where there are a significant
> number of possible operations, along with other permutations in the
> record that associate it with a different "workflow".  We can get
> success/failure ratio of those operations in workflows over particular
> time periods (15, 60, 180, 1440 minutes, et cetera), but we want to
> process this data over a much larger time period (30 days or more) to
> determine what's "normal" for those operations in the various workflows,
> essentially building percentage ranges for each of those permutations
> that indicate whether a permutation is "green", "orange", or "red".
>
> I've been told by someone who understands the statistics only a little
> better than me that a beta distribution function could help here, but I
> won't be able to implement this until we get help from someone who
> really understands the statistics here.
If the outcome of each call is either success or failure, the number
of successes in n calls follows a Binomial distribution (assuming the
calls are independent and share the same success probability), and
inference about the probability parameter (the success rate) can be
made in several ways. The easiest is classical inference in the
Binomial distribution. Another option, involving the Beta
distribution, is a Bayesian analysis: a Beta prior combined with the
Binomial likelihood yields a Beta posterior. So that is probably where
the Beta distribution was thought to come into play. If you want to
adjust for covariates (such as the time of day the call happened), you
can use Binomial regression, e.g. logistic or probit regression. With
regression you can test whether the covariates influence the outcome,
e.g. whether the success rate is higher for calls made in the morning.
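
To make the two options a bit more concrete, here is a minimal sketch
in plain Java (no Commons Math calls). The class and method names are
made up for illustration, the Wilson score interval is just one of
several classical intervals for a Binomial proportion, and the uniform
Beta(1, 1) prior is an assumption you would want to revisit.

```java
// Hypothetical helper, not part of Commons Math.
class SuccessRateSketch {

    /**
     * Classical inference: Wilson score interval for a Binomial proportion.
     * z is the standard normal quantile (e.g. 1.96 for roughly 95% coverage).
     */
    static double[] wilsonInterval(long successes, long trials, double z) {
        if (trials <= 0 || successes < 0 || successes > trials) {
            throw new IllegalArgumentException("need 0 <= successes <= trials, trials > 0");
        }
        double n = trials;
        double pHat = successes / n;
        double z2 = z * z;
        double denom = 1.0 + z2 / n;
        double center = (pHat + z2 / (2.0 * n)) / denom;
        double half = z * Math.sqrt(pHat * (1.0 - pHat) / n + z2 / (4.0 * n * n)) / denom;
        return new double[] { center - half, center + half };
    }

    /**
     * Bayesian update: a Beta(priorAlpha, priorBeta) prior combined with
     * Binomial data gives a Beta(priorAlpha + successes, priorBeta + failures)
     * posterior. Returns {alpha, beta} of the posterior.
     */
    static double[] betaPosterior(double priorAlpha, double priorBeta,
                                  long successes, long trials) {
        long failures = trials - successes;
        return new double[] { priorAlpha + successes, priorBeta + failures };
    }

    /** Mean of a Beta(alpha, beta) distribution. */
    static double betaMean(double alpha, double beta) {
        return alpha / (alpha + beta);
    }

    public static void main(String[] args) {
        // Made-up counts: 90 successful calls out of 100.
        double[] ci = wilsonInterval(90, 100, 1.96);
        System.out.println("Wilson interval: [" + ci[0] + ", " + ci[1] + "]");

        // Assumed uniform prior Beta(1, 1).
        double[] post = betaPosterior(1.0, 1.0, 90, 100);
        System.out.println("Posterior: Beta(" + post[0] + ", " + post[1] + ")"
                + ", mean " + betaMean(post[0], post[1]));
    }
}
```

The posterior parameters are what you would hand to a Beta distribution
implementation to get credible-interval bounds as quantiles.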

Indicator colors can then be assigned from intervals around the normal
success rate (confidence intervals for the classical approach,
credible intervals for the Bayesian one). For example: green if the
observed rate lies within the 75% interval, orange if it lies outside
that but within the 90% interval, and red otherwise.
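
One way to read the color rule above, as a small plain-Java sketch:
green inside the 75% interval, orange inside the 90% interval, red
otherwise. The class name, the method signature and the interval
bounds in main are all made up for illustration. In practice the
bounds would come from your 30 days of data via whichever interval
method you choose.

```java
// Hypothetical helper, not part of Commons Math.
class IndicatorSketch {

    enum Color { GREEN, ORANGE, RED }

    /**
     * Classifies an observed success rate against two nested intervals
     * around the "normal" rate. Each interval is {lower, upper}; the inner
     * one is the narrower (e.g. 75%) interval, the outer one the wider
     * (e.g. 90%) interval.
     */
    static Color classify(double rate, double[] inner, double[] outer) {
        if (rate >= inner[0] && rate <= inner[1]) {
            return Color.GREEN;
        }
        if (rate >= outer[0] && rate <= outer[1]) {
            return Color.ORANGE;
        }
        return Color.RED;
    }

    public static void main(String[] args) {
        // Made-up bounds for one operation in one workflow.
        double[] inner = { 0.88, 0.93 };
        double[] outer = { 0.85, 0.95 };

        System.out.println(classify(0.90, inner, outer)); // inside the inner interval
        System.out.println(classify(0.86, inner, outer)); // only inside the outer interval
        System.out.println(classify(0.70, inner, outer)); // outside both intervals
    }
}
```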

Cheers, Mikkel.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
> For additional commands, e-mail: user-help@commons.apache.org
>
>
