commons-issues mailing list archives

From "Gilles (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MATH-817) Multivariate Normal Mixture Model Fitting by Expectation Maximization
Date Mon, 12 Nov 2012 22:59:12 GMT

    [ https://issues.apache.org/jira/browse/MATH-817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495728#comment-13495728 ]

Gilles commented on MATH-817:
-----------------------------

bq. [...] You aren't required to specify full mixture components during initialization.

Isn't this syntactic sugar? The parameters _are_ needed for the computation to take place;
IIUC, you provide (in method "createMultivariateNormalForInitialization") a way to create
the missing data.
IMHO, it would be better, design-wise, to make that method "public", with a descriptive name
like:
{code}
public static MultivariateNormalDistribution
estimateMultivariateNormalDistribution(double[][] data) {
  // ...
}
{code}
The API would then require the initial parameters to be specified, either independently or
through the provided estimator (at the user's choice).
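
As a rough illustration of what such a public estimator might compute, the sketch below
derives a sample mean vector and an unbiased sample covariance matrix from the data. It
assumes that these two quantities are the initial parameters wanted; the actual logic of
"createMultivariateNormalForInitialization" in the patch is not shown here and may differ.
The class "InitialEstimateSketch" and its methods are hypothetical names, and plain arrays
are used so that the example does not depend on any Commons Math class.

```java
/** Hypothetical helper for illustration; not part of Commons Math or of the patch. */
final class InitialEstimateSketch {
    private InitialEstimateSketch() {}

    /** Column-wise sample mean; each row of "data" is one observation. */
    static double[] sampleMean(double[][] data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("no observations");
        }
        final int n = data.length;
        final int dim = data[0].length;
        final double[] mean = new double[dim];
        for (double[] row : data) {
            for (int j = 0; j < dim; j++) {
                mean[j] += row[j] / n;
            }
        }
        return mean;
    }

    /** Unbiased sample covariance (divides by n - 1); needs at least two rows. */
    static double[][] sampleCovariance(double[][] data, double[] mean) {
        final int n = data.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        final int dim = mean.length;
        final double[][] cov = new double[dim][dim];
        for (double[] row : data) {
            for (int i = 0; i < dim; i++) {
                for (int j = 0; j < dim; j++) {
                    cov[i][j] += (row[i] - mean[i]) * (row[j] - mean[j]) / (n - 1);
                }
            }
        }
        return cov;
    }
}
```

The returned arrays would then be used to build the initial distribution explicitly, so
that the fitter itself never has to invent missing parameters.
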
Similarly, if the weights are not specified, you assign random values under the hood ("initRandomComponentWeights").
I think that it is better to make it explicit that initial values are needed, but that random
values can be used. You could then provide another syntactic sugar method such as:
{code}
public static List<Pair<Double, MultivariateNormalDistribution>>
makeRandomMultivariateNormalDistributionMixture(List<MultivariateNormalDistribution> components) {
  // Assign a random weight to each distribution in "components".
}
{code}
Or even combine both methods:
{code}
public static List<Pair<Double, MultivariateNormalDistribution>>
estimateRandomMultivariateNormalDistributionMixture(double[][] data,
                                                    int numComponents) {
  // Estimate and assign random weights.
}
{code}
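
The random-weight assignment described above could look roughly like the following sketch.
It assumes that the weights should be strictly positive and sum to one, which is the usual
requirement for mixture weights but is not stated in the patch. "RandomWeightsSketch" and
"withRandomWeights" are hypothetical names, and "Map.Entry" stands in for the "Pair" type
so that the example is self-contained.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical helper for illustration; not part of Commons Math or of the patch. */
final class RandomWeightsSketch {
    private RandomWeightsSketch() {}

    /** Pairs each component with a random positive weight; the weights sum to 1. */
    static <T> List<Map.Entry<Double, T>> withRandomWeights(List<T> components, Random rng) {
        if (components.isEmpty()) {
            throw new IllegalArgumentException("no components");
        }
        final double[] raw = new double[components.size()];
        double sum = 0;
        for (int i = 0; i < raw.length; i++) {
            // nextDouble() lies in [0, 1), so this value lies in (0, 1] and is never zero.
            raw[i] = 1.0 - rng.nextDouble();
            sum += raw[i];
        }
        final List<Map.Entry<Double, T>> mixture = new ArrayList<>(raw.length);
        for (int i = 0; i < raw.length; i++) {
            // Normalize so that the weights form a valid set of mixing proportions.
            mixture.add(new SimpleImmutableEntry<>(raw[i] / sum, components.get(i)));
        }
        return mixture;
    }
}
```

Passing the "Random" instance in, rather than creating it internally, lets the caller fix
the seed and reproduce a given initialization.
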

IIUC, this approach would avoid the many constructors with disparate parameters that can be
null and, if not null, must be checked for consistency (number of dimensions), as well as the
various code paths that depend on whether some argument is null.
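
A minimal sketch of the resulting shape: one constructor, every initial value mandatory,
and one validation pass. "MixtureFitterSketch" is a hypothetical class, not the fitter
attached to this issue, and covariances are left out for brevity.

```java
import java.util.Objects;

/** Hypothetical fitter skeleton for illustration; not the class attached to this issue. */
final class MixtureFitterSketch {
    private final double[][] data;
    private final double[] weights;
    private final double[][] means;

    /** Single constructor: all initial values are required, none may be null. */
    MixtureFitterSketch(double[][] data, double[] weights, double[][] means) {
        Objects.requireNonNull(data, "data");
        Objects.requireNonNull(weights, "weights");
        Objects.requireNonNull(means, "means");
        if (data.length == 0 || weights.length == 0) {
            throw new IllegalArgumentException("data and weights must not be empty");
        }
        if (weights.length != means.length) {
            throw new IllegalArgumentException("one weight per component is required");
        }
        final int dim = data[0].length;
        for (double[] mean : means) {
            // The only consistency check left: component dimension versus data dimension.
            if (mean.length != dim) {
                throw new IllegalArgumentException("component dimension must be " + dim);
            }
        }
        this.data = data;
        this.weights = weights;
        this.means = means;
    }

    int numComponents() {
        return weights.length;
    }
}
```

A caller who has no initial values would obtain them from the estimator or random-weight
helpers first, then call this constructor as usual.
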

                
> Multivariate Normal Mixture Model Fitting by Expectation Maximization
> ---------------------------------------------------------------------
>
>                 Key: MATH-817
>                 URL: https://issues.apache.org/jira/browse/MATH-817
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Jared Becksfort
>            Priority: Minor
>         Attachments: AbstractMultivariateRealDistribution.java.patch, MixtureMultivariateRealDistribution.java.patch,
> MultivariateNormalDistribution.java.patch, MultivariateNormalMixtureExpectationMaximizationFitter.java,
> MultivariateNormalMixtureExpectationMaximizationFitterTest.java
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> I will submit a class for fitting Multivariate Normal Mixture Models using Expectation Maximization.
> > Hello,
> >
> > I have implemented some classes for multivariate Normal distributions, multivariate
> > normal mixture models, and an expectation maximization fitting class for the mixture model.
> > I would like to submit it to Apache Commons Math. I still have some touching up to do so
> > that they fit the style guidelines and implement the correct interfaces. Before I do so,
> > I thought I would at least ask if the developers of the project are interested in me submitting
> > them.
> >
> > Thanks,
> > Jared Becksfort
> Dear Jared,
> Yes, that would be very nice to have such an addition! Remember to also include unit
> tests (refer to the current ones for examples). The best would be to split a submission up
> into multiple minor ones, each covering a natural submission (e.g. multivariate Normal distribution
> in one submission), and create an issue as described at http://commons.apache.org/math/issue-tracking.html.
> If you run into any problems, please do not hesitate to ask on this mailing list.
> Cheers, Mikkel.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
