mahout-user mailing list archives

From Jake Mannix <>
Subject Re: clustering your data with dirichlet issue
Date Tue, 06 Apr 2010 17:39:29 GMT
What we really need is a nice utility to take clustered output and maybe
label all of the vectors in the training set (and new vectors, if it's
either a generative model or one which allows "folding in") with some labels
in a Vector wrapper, and maybe some sort of statistics-generating utility
which prints out data about the clustering (number of points per cluster,
how wide the clusters are, what the centroids are, or other stuff like that).
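
As a rough illustration of the kind of statistics utility described above,
here is a minimal sketch in plain Python. It does not use Mahout's API: the
function names, the list-of-lists input format, the choice of RMS distance to
the centroid as the "width", and nearest-centroid assignment as a stand-in for
labeling new vectors are all assumptions made for the example.

```python
from collections import defaultdict
from math import sqrt


def cluster_stats(points, labels):
    """Return size, centroid, and RMS radius for each cluster.

    points: list of equal-length lists of floats (hypothetical input format)
    labels: list of cluster ids, one per point
    """
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)

    stats = {}
    for label, members in groups.items():
        n = len(members)
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / n for d in range(dim)]
        # "Width" is taken here as the root-mean-square distance to the
        # centroid; this is one reasonable choice, not the only one.
        sq_dist = sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in members
        )
        stats[label] = {
            "size": n,
            "centroid": centroid,
            "rms_radius": sqrt(sq_dist / n),
        }
    return stats


def nearest_cluster(point, stats):
    """Label a new point with the id of the closest centroid.

    This is a simple stand-in for labeling unseen vectors; a generative
    model would assign by posterior probability instead.
    """
    def sq_dist_to(label):
        centroid = stats[label]["centroid"]
        return sum((x - c) ** 2 for x, c in zip(point, centroid))

    return min(stats, key=sq_dist_to)


if __name__ == "__main__":
    pts = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]]
    ids = [0, 0, 1]
    for cluster_id, s in sorted(cluster_stats(pts, ids).items()):
        print(cluster_id, s["size"], s["centroid"], s["rms_radius"])
```

A real version would read the clustering job's output and the input vectors
rather than in-memory lists, but the per-cluster summary would look much the
same.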

This is really something true of all of the clustering classes / jobs, not


On Tue, Apr 6, 2010 at 10:30 AM, Ted Dunning <> wrote:

> This isn't far from true.  I was just thinking something along the same
> lines, but phrased a bit differently.
> My thought was that if the concept and output is sooo different, will users
> be able to use it even if the dumper is made to work well?
> On Tue, Apr 6, 2010 at 10:27 AM, Jake Mannix <> wrote:
> >
> > Without this final step, this seems very much like an unfinished feature,
> > to the point of being unusable.
