hama-user mailing list archives

From Praveen Sripati <praveensrip...@gmail.com>
Subject Canopy Clustering on BSP
Date Sat, 07 Apr 2012 13:19:23 GMT

After Thomas's implementation of K-Means (3), I was motivated to extend it
with Canopy clustering, so I started looking at the MR implementation of
Canopy (1) and (2). The MR implementation of Canopy clustering runs in two
MR phases: the first identifies the canopies, and the second assigns the
data points to those canopies. I don't see much improvement from doing this
with BSP. Please correct me if I am wrong.
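
As a rough illustration of those two phases, here is a minimal
single-machine sketch in Python. It is not the Mahout or Hama code: the
function names, the Euclidean distance, and the thresholds t1 > t2 are
assumptions made for illustration, and phase 2 here simply lists every
canopy whose center lies within t1 of a point.

```python
import math  # math.dist needs Python 3.8+


def find_canopy_centers(points, t1, t2):
    """Phase 1 (sketch): pick canopy centers with a loose threshold t1
    and a tight threshold t2, where t1 > t2."""
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    centers = []
    while remaining:
        # Take the next candidate point as a new canopy center.
        center = remaining.pop(0)
        centers.append(center)
        # Points closer than t2 to this center can no longer become
        # centers themselves, so drop them from the candidate list.
        remaining = [p for p in remaining if math.dist(center, p) >= t2]
    return centers


def assign_to_canopies(points, centers, t1):
    """Phase 2 (sketch): for each point, list the indices of all canopies
    whose center is within t1. A point may fall in several canopies."""
    return [
        [i for i, c in enumerate(centers) if math.dist(c, p) < t1]
        for p in points
    ]


points = [(0, 0), (1, 0), (5, 0), (6, 0), (10, 0)]
centers = find_canopy_centers(points, t1=3.0, t2=1.5)
membership = assign_to_canopies(points, centers, t1=3.0)
```

Both functions are a single pass over the data with no iteration between
them, which is consistent with the observation that there is little for
BSP supersteps to save compared with two MR jobs.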

Also, are there any algorithms that can be implemented easily on Hama/BSP
(for those who, like me, are just getting started with it) and that would
also show some performance improvement over the MR implementation? I have
looked at Mahout, which implements many algorithms, and would like to see
something similar in Hama.


(1) -
(2) - https://cwiki.apache.org/confluence/display/MAHOUT/Canopy+Clustering
(3) -
