mahout-user mailing list archives

From rahul raghavendhra <rahulraghavendh...@gmail.com>
Subject Re: Help using mahout for k-means clustering on existing vectors
Date Wed, 11 Jan 2012 10:31:06 GMT
Hi all,

I have run org.apache.mahout.clustering.syntheticcontrol.<>.Job
successfully.

When I run it on a similar dataset (double values separated by a single
space), I get this error:

org.apache.mahout.math.CardinalityException: Required cardinality 16 but got 91

How is this cardinality calculated, and how is it passed to the k-means driver?
How do I calculate the cardinality for any dataset?

Please help.




./rahul
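
For context on the question above: in Mahout, a vector's cardinality is its number of dimensions (the value `Vector.size()` returns), and a CardinalityException means two vectors of different sizes met in one operation. One possible cause, which is an assumption since the dataset is not shown, is that the input rows do not all contain the same number of values. Another is that cluster centers from an earlier run with a different dimension are being reused. The sketch below uses plain Java with no Mahout calls; the class and method names are hypothetical. It counts the values on each space-separated line and reports the lines that disagree with the first one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Mahout: checks that every row of a
// space-separated dataset has the same number of values (the same cardinality).
class CardinalityCheck {

    // Number of whitespace-separated values on one line; 0 for a blank line.
    static int cardinality(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        return trimmed.split("\\s+").length;
    }

    // 1-based numbers of the lines whose cardinality differs from that of
    // the first non-blank line. Blank lines are skipped.
    static List<Integer> mismatchedLines(List<String> lines) {
        List<Integer> bad = new ArrayList<>();
        int expected = -1;
        for (int i = 0; i < lines.size(); i++) {
            int c = cardinality(lines.get(i));
            if (c == 0) {
                continue;
            }
            if (expected < 0) {
                expected = c;
            } else if (c != expected) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("1.0 2.0 3.0", "4.0 5.0 6.0", "7.0 8.0");
        System.out.println(mismatchedLines(rows)); // prints [3]
    }
}
```

If every line reports the same count, the dataset itself is consistent, and the mismatch more likely comes from something else in the job, such as previously generated cluster files.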


On Tue, Jan 10, 2012 at 9:31 AM, Grant Ingersoll <gsingers@apache.org> wrote:

> The CSVVectorIterator will get you vectors from a CSV file, then you just
> need to write them out to the SequenceFile.  All you need is a driver that
> wraps the SequenceFileVectorWriter and calls the write method.
>
>
> On Jan 9, 2012, at 2:50 PM, Daniel Quach wrote:
>
> > I have a file of vectors I formulated in csv format, and I want to use
> > mahout to perform k-means clustering on the vectors in this file.
> >
> > However, it seems mahout expects the input data to be formatted in a
> > SequenceFile format, and I'm not sure if there's a way to easily do this
> > (are there existing tools?)
>
> --------------------------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
>
>
>
>
