mahout-user mailing list archives

From Liang Chenmin <liangchenmi...@gmail.com>
Subject A question about the naming of the cluster and points in synthetic data cluster
Date Wed, 25 Nov 2009 00:17:16 GMT
Hi all,
    I am new to Mahout. I have a question about how to attach names to the
clusters and points in the synthetic data clustering example.

    After running the synthetic data clustering example, I get 6 clusters,
and each one looks like this:

### First comes the information about the cluster
0:name::{"class":"org.apache.mahout.matrix.SparseVector","vector":"{\"values\":{\"indices\":[0,1,2...59],\"values\":[29.58838112577385,...],\"numMappings\":60},\"cardinality\":60,\"lengthSquared\":-1.0,\"name\":\"\"}"}

### Followed by the points that belong to this cluster:
Points:
{"class":"org.apache.mahout.matrix.SparseVector","vector":"{\"values\":{\"indices\":[0,1,2,...,59],\"values\":[28.7812,34.4632,......
],],\"numMappings\":60},\"cardinality\":60,\"lengthSquared\":-1.0,\"name\":\"\"}"},

{"class":"org.apache.mahout.matrix.SparseVector","vector":"{\"values\":{\"indices\"
....


Is there a way for me to specify the name of each cluster? More
importantly, if I have an ID for each point, how can I show that ID in the
final result? I want to see clearly which IDs fall in each cluster. I have
also run this on my own data, and the output looks similar to the above,
although the indices differ because my matrix is sparse. Since my data set
is large, getting the IDs is quite important to me.
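
To make the request concrete: every record in the output above already carries a "name" field, which is empty in my results. Below is a minimal Python sketch of how I would read an ID back out if it could be stored there. It assumes each record is the doubly encoded JSON shown above (an outer object whose "vector" value is itself a JSON string). The point_name helper, the shortened sample record, and the "point-17" ID are all hypothetical and are not part of Mahout.

```python
import json


def point_name(record: str) -> str:
    """Return the "name" field of one point record.

    Assumes the layout shown in the output above: an outer JSON object
    whose "vector" value is itself a JSON-encoded string.
    """
    outer = json.loads(record)
    inner = json.loads(outer["vector"])
    return inner["name"]


# Hypothetical record, shortened to 2 dimensions for readability.
sample = json.dumps({
    "class": "org.apache.mahout.matrix.SparseVector",
    "vector": json.dumps({
        "values": {
            "indices": [0, 1],
            "values": [28.7812, 34.4632],
            "numMappings": 2,
        },
        "cardinality": 2,
        "lengthSquared": -1.0,
        "name": "point-17",  # made-up ID, for illustration only
    }),
})

print(point_name(sample))
```

With the output I currently get, the same call would return an empty string for every point, which is why I am asking how to set the name in the first place.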

Thanks,
Mandy
