mahout-user mailing list archives

From Jeff Eastman <>
Subject Re: Help using mahout for k-means clustering on existing vectors
Date Mon, 09 Jan 2012 21:52:38 GMT
The Synthetic Control examples use a similar (but space-delimited) input
format, and there is an InputDriver in integration/ that can convert
those files into Mahout Vector sequence files. You could easily modify
the InputMapper to split on commas instead, or convert your own files to
use spaces.
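
As a rough sketch of the delimiter change only: the class below is hypothetical
(CsvLineParser and parse are illustrative names, not part of Mahout) and it
deliberately leaves out the Hadoop and Mahout classes. It assumes each line
holds one vector as comma-separated numbers. In a modified InputMapper, the
resulting values would still need to be wrapped in a Mahout Vector and written
out by the driver.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper, not Mahout code: shows splitting on commas
// instead of spaces when turning one text line into numeric values.
public class CsvLineParser {

    // Assumption: fields are separated by commas, with optional
    // whitespace on either side of each comma.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // Parses one line such as "1.0, 2.5,3" into its numeric values.
    // Blank lines yield an empty array; a non-numeric field throws
    // NumberFormatException.
    public static double[] parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return new double[0];
        }
        String[] fields = COMMA.split(trimmed);
        double[] values = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            values[i] = Double.parseDouble(fields[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("1.0, 2.5,3")));
    }
}
```

Going the other way (rewriting the CSV file to use spaces) avoids touching the
mapper at all, which may be simpler if the files are small.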

On 1/9/12 12:50 PM, Daniel Quach wrote:
> I have a file of vectors I formulated in csv format, and I want to use mahout to perform
> k-means clustering on the vectors in this file.
> However, it seems mahout expects the input data to be formatted in a SequenceFile format,
> and I'm not sure if there's a way to easily do this (are there existing tools?)
