mahout-user mailing list archives

From Robert Hall
Subject From example to job
Date Thu, 12 Jul 2012 19:30:00 GMT

I'm trying to move from the Mahout examples to a practical job of my
own. I'm very new to Mahout, but I do have some experience with
machine learning, clustering, and classification.

My goal: to get KMeans clusters of time-based use from structured data.

Example Input:
John Doe,1324,1233,2234,1267,1456,1745,1212

Each row has a name followed by a variable-length series of numbers, each
the time in seconds to complete an operation. The times are pre-filtered
(> 1200) and built by date/time of the operation (pivoted into nameless
columns), but the date/time itself is not relevant to my goal.
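
To make the row layout concrete, here is a minimal sketch of splitting one
such row into a name and its series of times. It is plain Java and uses no
Mahout classes; RowParser, Row, and parse are hypothetical names chosen only
for illustration.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Mahout: splits one comma-separated row
// into a name and its variable-length series of times (in seconds).
class RowParser {
    static final class Row {
        final String name;
        final double[] seconds;

        Row(String name, double[] seconds) {
            this.name = name;
            this.seconds = seconds;
        }
    }

    static Row parse(String line) {
        String[] fields = line.split(",");
        // The first field is the name; every remaining field is a time.
        double[] seconds = new double[fields.length - 1];
        for (int i = 1; i < fields.length; i++) {
            seconds[i - 1] = Double.parseDouble(fields[i].trim());
        }
        return new Row(fields[0].trim(), seconds);
    }

    public static void main(String[] args) {
        Row row = parse("John Doe,1324,1233,2234,1267,1456,1745,1212");
        System.out.println(row.name + " -> " + Arrays.toString(row.seconds));
    }
}
```

Because the series length varies from row to row, the resulting arrays have
different sizes.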

Can someone point me toward any resources that explain, not how to run an
example, but how the examples were put together?

If not a resource, how about a high-level description of what Mahout is
looking for and how it performs, say, a KMeans cluster analysis?

Finally, can someone describe a Mahout vector and vector file? I'm after a
description plus the actual format of a vector row/file.

Robert Hall
