mahout-dev mailing list archives

From "Djellel Eddine Difallah (JIRA)" <>
Subject [jira] Updated: (MAHOUT-551) Kmeans example with space delimited data
Date Sun, 21 Nov 2010 21:09:17 GMT


Djellel Eddine Difallah updated MAHOUT-551:

    Status: Patch Available  (was: Open)

> Kmeans example with space delimited data
> ----------------------------------------
>                 Key: MAHOUT-551
>                 URL:
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Utils
>    Affects Versions: 0.4
>            Reporter: Djellel Eddine Difallah
>            Priority: Minor
>         Attachments: MAHOUT-551.patch
> The provided Kmeans clustering example, which uses the synthetic control data,
> asks for the t1 and t2 thresholds because it runs the Canopy Driver to determine
> the initial clusters. In its original form, Kmeans only requires a value K and
> generates K random centers from the input data. I propose adding another example
> to the package that clusters any space-delimited numerical input with Kmeans in
> its original form, without using Canopy. The modification is quite simple and is
> mostly based on the synthetic control Job.
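
To make the proposal concrete, here is a minimal plain-Java sketch of the idea: parse space-delimited numeric lines into points, pick K input points at random as the initial centers (no Canopy pass, so no t1/t2), and iterate Kmeans. It is not the attached patch and does not use Mahout's API; the class name `SpaceDelimitedKMeansSketch`, its methods, the `seed` parameter, and the sample data are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not Mahout code and not MAHOUT-551.patch.
public class SpaceDelimitedKMeansSketch {

    // Parse one non-blank line of whitespace-delimited numbers into a point.
    static double[] parseLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        double[] point = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            point[i] = Double.parseDouble(tokens[i]);
        }
        return point;
    }

    // Choose k distinct input points as initial centers (replaces the Canopy step).
    static List<double[]> pickInitialCenters(List<double[]> points, int k, long seed) {
        if (k < 1 || k > points.size()) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        List<double[]> shuffled = new ArrayList<>(points);
        Collections.shuffle(shuffled, new Random(seed));
        List<double[]> centers = new ArrayList<>();
        for (double[] p : shuffled.subList(0, k)) {
            centers.add(p.clone());
        }
        return centers;
    }

    // Index of the center closest to p by squared Euclidean distance.
    static int nearest(double[] p, List<double[]> centers) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centers.size(); c++) {
            double dist = 0.0;
            for (int d = 0; d < p.length; d++) {
                double diff = p[d] - centers.get(c)[d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    // Standard Kmeans iterations; stops early once no center moves.
    static List<double[]> cluster(List<double[]> points, int k, int maxIterations, long seed) {
        List<double[]> centers = pickInitialCenters(points, k, seed);
        int dim = points.get(0).length;
        for (int iter = 0; iter < maxIterations; iter++) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (double[] p : points) {
                int c = nearest(p, centers);
                counts[c]++;
                for (int d = 0; d < dim; d++) {
                    sums[c][d] += p[d];
                }
            }
            boolean moved = false;
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // an empty cluster keeps its previous center
                }
                for (int d = 0; d < dim; d++) {
                    double mean = sums[c][d] / counts[c];
                    if (mean != centers.get(c)[d]) {
                        moved = true;
                    }
                    centers.get(c)[d] = mean;
                }
            }
            if (!moved) {
                break;
            }
        }
        return centers;
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for a space-delimited data file.
        String[] lines = {"0 0", "0 1", "10 10", "10 11"};
        List<double[]> points = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                points.add(parseLine(line));
            }
        }
        for (double[] center : cluster(points, 2, 20, 42L)) {
            System.out.println(java.util.Arrays.toString(center));
        }
    }
}
```

The only user-supplied clustering parameter here is K (plus an iteration cap), which is the point of the proposal; a real Mahout example would presumably also need to convert the input into Mahout's vector format, which this sketch leaves out.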

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
