flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hajira Jabeen <hajirajab...@gmail.com>
Subject Understanding Kmeans in Flink
Date Mon, 28 Dec 2015 16:21:32 GMT
Hello everyone,

I am trying to understand Kmeans in Flink, Scala.

I can see that the attached Kmeans-snippet (taken from Flink examples)
updates centroids.

in (1) map function assigns points to centroids,
in (3) centroids are grouped by their ids.
in (4) the x and y coordinates are being added

But, I cannot understand what happens at (2) and then (5) ?
I will really appreciate, if any one can elaborate how this works ?


Thanks
Hajira

-------------------------------------
K means code snippet
--------------------------------------
val newCentroids = points
1)        .map(new
SelectNearestCenter()).withBroadcastSet(currentCentroids, "centroids")
2)        .map { x => (x._1, x._2, 1L) }
3)        .groupBy(0)                                 // by centroid ID
4)        .reduce { (p1, p2) => (p1._1, p1._2.add(p2._2), p1._3 + p2._3) }
5)        .map { x => new Centroid(x._1, x._2.div(x._3)) }
      newCentroids
--------------------------------------

Mime
View raw message