commons-issues mailing list archives

From "Thomas Neidhart (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MATH-917) More distance measurements are needed in o.a.c.m.stat.clustering.
Date Sat, 23 Mar 2013 15:29:15 GMT

     [ https://issues.apache.org/jira/browse/MATH-917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Neidhart updated MATH-917:
---------------------------------

    Attachment: clustering.zip

I have attached the result of some refactoring of the cluster package.

It includes the following changes/additions:

 * Move distance calculation from the Clusterable interface to a dedicated DistanceMeasure
interface, with a first concrete implementation: EuclideanDistance.
 * Make Cluster more general: remove the center, as it is only used by centroid-based
clustering algorithms.
 * Introduce a Clusterer interface that currently has only one method, cluster(Collection<Clusterable>);
we may add more, e.g. an overload with a maxIterations argument.
 * Add an AbstractClusterer class that provides what every clusterer needs, e.g. a
distance measure.
 * The existing clustering algorithms implement the new interface via the abstract class.
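
To make the list above concrete, here is a minimal sketch of how these types could fit together. It is not the code in clustering.zip: the method names compute, getPoint, addPoint, getPoints and distance, as well as the use of generics, are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Assumed shape: a data object only exposes its coordinates. */
interface Clusterable {
    double[] getPoint();
}

/** Distance calculation lives here instead of in Clusterable. */
interface DistanceMeasure {
    double compute(double[] a, double[] b);
}

/** First concrete implementation: the usual L2 distance. */
class EuclideanDistance implements DistanceMeasure {
    @Override
    public double compute(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}

/** General cluster: just a set of points, no center. */
class Cluster<T extends Clusterable> {
    private final List<T> points = new ArrayList<>();

    void addPoint(T point) {
        points.add(point);
    }

    List<T> getPoints() {
        return points;
    }
}

/** Single entry point; generics are an assumption of this sketch. */
interface Clusterer<T extends Clusterable> {
    List<Cluster<T>> cluster(Collection<T> points);
}

/** Holds what every clusterer needs, here only the distance measure. */
abstract class AbstractClusterer<T extends Clusterable> implements Clusterer<T> {
    private final DistanceMeasure measure;

    protected AbstractClusterer(DistanceMeasure measure) {
        this.measure = measure;
    }

    /** Convenience for subclasses: distance between two clusterable objects. */
    protected double distance(Clusterable p1, Clusterable p2) {
        return measure.compute(p1.getPoint(), p2.getPoint());
    }
}
```

With this layout, a concrete algorithm would extend AbstractClusterer and call distance(...) wherever it previously asked a point for its distance to another point, so swapping the measure would not require touching the data objects.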

I like the Clusterable interface, as it makes it quite easy to extend existing data objects
so that they can serve as input for the clusterer. The simple *Point implementations have
been kept, but I am not fully happy with the name.

I would like feedback on whether this goes in the right direction; if so, I will finish the
contribution.
                
> More distance measurements are needed in o.a.c.m.stat.clustering.
> -----------------------------------------------------------------
>
>                 Key: MATH-917
>                 URL: https://issues.apache.org/jira/browse/MATH-917
>             Project: Commons Math
>          Issue Type: Improvement
>            Reporter: Reid Hochstedler
>             Fix For: 4.0
>
>         Attachments: clustering.zip
>
>
> Currently only Euclidean distance is used for distance measurement. It would be easy
> to add Manhattan and Chebyshev distance, among others.
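
As an illustration of the two measures named in the issue description, the following sketch implements them against a hypothetical DistanceMeasure interface. The interface and its compute method are assumptions, repeated here so the sketch compiles on its own; the class names are illustrative and do not come from the attachment.

```java
/** Hypothetical interface, assumed for this sketch. */
interface DistanceMeasure {
    double compute(double[] a, double[] b);
}

/** Manhattan (L1) distance: sum of absolute coordinate differences. */
class ManhattanDistance implements DistanceMeasure {
    @Override
    public double compute(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }
}

/** Chebyshev (L-infinity) distance: largest absolute coordinate difference. */
class ChebyshevDistance implements DistanceMeasure {
    @Override
    public double compute(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double max = 0.0;
        for (int i = 0; i < a.length; i++) {
            max = Math.max(max, Math.abs(a[i] - b[i]));
        }
        return max;
    }
}
```

For the points (1, 2) and (4, 6), the coordinate differences are 3 and 4, so the Manhattan distance is 7 and the Chebyshev distance is 4.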

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
