commons-dev mailing list archives

From John Gant <john.g...@gmail.com>
Subject Re: [math] Re: commons math
Date Mon, 22 Aug 2005 23:02:42 GMT
> What exactly does "column-wise" mean?  This just looks like Pearson's
> R, which is already available in the SimpleRegression class.  Do you
> mean generation of correlation matrices?

Sorry, I should have been more specific. This will allow someone to
calculate Pearson's r between each pair of column vectors. The
result is a correlation matrix with dimensions c x c, where c is
the number of columns in the raw data matrix.
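
As a rough illustration of the column-wise calculation described above (a sketch only, not code from commons-math or from the unposted patch), the class and method names below are hypothetical, and the data layout data[row][column] is an assumption:

```java
/** Hypothetical helper for illustration; not part of commons-math. */
final class CorrelationMatrixSketch {

    private CorrelationMatrixSketch() {}

    /**
     * Assumes data[i][j] holds observation (row) i of variable (column) j.
     * Returns a c x c matrix whose (a, b) entry is Pearson's r between
     * columns a and b. A constant column yields NaN entries (0/0).
     */
    static double[][] correlationMatrix(double[][] data) {
        int n = data.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two rows");
        }
        int c = data[0].length;

        // Column means.
        double[] mean = new double[c];
        for (double[] row : data) {
            if (row.length != c) {
                throw new IllegalArgumentException("rows must have equal length");
            }
            for (int j = 0; j < c; j++) {
                mean[j] += row[j] / n;
            }
        }

        // Sums of cross-products of deviations from the column means
        // (upper triangle only, since the result is symmetric).
        double[][] s = new double[c][c];
        for (double[] row : data) {
            for (int a = 0; a < c; a++) {
                for (int b = a; b < c; b++) {
                    s[a][b] += (row[a] - mean[a]) * (row[b] - mean[b]);
                }
            }
        }

        // r(a, b) = S_ab / sqrt(S_aa * S_bb), mirrored into the lower triangle.
        double[][] r = new double[c][c];
        for (int a = 0; a < c; a++) {
            for (int b = a; b < c; b++) {
                double v = s[a][b] / Math.sqrt(s[a][a] * s[b][b]);
                r[a][b] = v;
                r[b][a] = v;
            }
        }
        return r;
    }
}
```

For a raw matrix with 3 columns this returns a 3 x 3 matrix with (up to rounding) ones on the diagonal.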

> > Distance measures, are basically a numeric way of classifying a
> > relationship between two numerical or categorical datasets. Usually
> > distance measures are used in conjunction with k-means, or
> > hierarchical clustering (or some type of clustering algorithm).

> Are these essentially metrics on R^n (the "numerical" case) or
> homogeneity measures (e.g. chi-square, for the categorical case)?

The numerical distance measures can be something as simple as
Euclidean distance or a correlation coefficient. The categorical
measures are more logical (less numerical); something like Hamming
distance could be used. Does this answer your question?
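
To make the categorical case concrete, here is a minimal sketch of a Hamming distance over two equal-length vectors of category labels. The class name, the method name, and the choice of String labels are assumptions made for illustration only:

```java
/** Hypothetical helper for illustration; not part of commons-math. */
final class HammingSketch {

    private HammingSketch() {}

    /** Counts the positions at which the two label vectors differ. */
    static int hammingDistance(String[] x, String[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        int differences = 0;
        for (int i = 0; i < x.length; i++) {
            if (!x[i].equals(y[i])) {
                differences++;
            }
        }
        return differences;
    }
}
```

Two records that agree on every attribute have distance 0, and each attribute on which they disagree adds 1.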
  
> If a clustering algorithm can use multiple different distance
> measures, then it does make sense to encapsulate the distance measure.
>  Defining a distance measure or metric interface and then defining
> implementation classes that implement that interface and having the
> clustering algorithms have instances of these as members is a
> reasonable way to do this, IMHO.


A clustering algorithm is usually independent of the distance measure,
but relies on the measure to identify clusters. All clustering
algorithms that I have experience with use distance measures, and I
plan to set up the implementation so that it is similar to the
contract of Collections.sort(), with the distance measure playing the
role of the Comparator. I have written an interface,
DistanceMeasure, which has a single method, calculateDistance(). The
interface is currently implemented by the EuclideanDistance class. I
have not posted this code yet, and still need to finish the unit tests.
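
Since the code has not been posted, the following is only a guess at the shape of such a design, not the actual implementation. The signature of calculateDistance() (two double[] arguments, double result) and the NearestCentroidSketch helper are assumptions; the point is that the algorithm receives the measure as an argument, much as Collections.sort(List, Comparator) receives its ordering:

```java
/** Assumed shape of the interface; the real signature was not posted. */
interface DistanceMeasure {
    double calculateDistance(double[] a, double[] b);
}

/** Straight-line (L2) distance between two points in R^n. */
final class EuclideanDistance implements DistanceMeasure {
    public double calculateDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares);
    }
}

/** Hypothetical clustering step that is independent of the measure used. */
final class NearestCentroidSketch {

    private NearestCentroidSketch() {}

    /** Returns the index of the centroid closest to point, or -1 if there are none. */
    static int nearest(double[] point, double[][] centroids, DistanceMeasure measure) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centroids.length; i++) {
            double d = measure.calculateDistance(point, centroids[i]);
            if (d < bestDistance) {
                bestDistance = d;
                best = i;
            }
        }
        return best;
    }
}
```

Swapping in another measure would then mean passing a different DistanceMeasure implementation to nearest(), with no change to the clustering code itself.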

Thanks,
John
