* [http://www.slideshare.net/steve_l/graphs-1848617 Graphs] Paolo Castagna, HP
This was a talk by Paolo Castagna on graph work under MapReduce (MR), of which PageRank is the classic application.
 * The graph topology does not change from one iteration to the next, so why ship it around in every MR job?
 * The graph defines which other jobs you need to communicate with.
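To make the first point concrete, here is a minimal single-process sketch of iterative PageRank in which the topology is loaded once and only the rank values change between iterations. It is an illustration, not code from the talk: the function name `pagerank`, the damping factor 0.85, the iteration count and the toy graph are all assumptions, and every link target is assumed to appear as a key in `out_links`.

```python
def pagerank(out_links, damping=0.85, iterations=20):
    """Iterative PageRank over a fixed topology.

    out_links maps each node to the list of nodes it links to.
    It is read on every iteration but never modified or copied;
    only the rank values change from one iteration to the next.
    """
    nodes = list(out_links)
    n = len(nodes)
    ranks = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        incoming = {node: 0.0 for node in nodes}
        dangling = 0.0  # rank held by nodes with no outgoing links
        for node, targets in out_links.items():
            if targets:
                share = ranks[node] / len(targets)
                for target in targets:
                    incoming[target] += share
            else:
                dangling += ranks[node]
        # Dangling rank is spread evenly so the ranks keep summing to 1.
        ranks = {
            node: (1 - damping) / n + damping * (incoming[node] + dangling / n)
            for node in nodes
        }
    return ranks


# Hypothetical three-node graph, used only for illustration.
TOY_GRAPH = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
```

In a chain of MR jobs the equivalent of `out_links` has to be re-read and re-emitted alongside the ranks in every job; here the only per-iteration state is the `ranks` dictionary.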
The graph is a massive data structure which, if you are doing inference work, only grows in relationships. Steve thinks: you may need some graph model that is shared across servers and that they can all add to. There is a small problem here, keeping the information current on 4000 servers. But what if you don't have to? What if you treat updates to the graph as facts to propagate lazily around the cluster?
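One possible reading of the "lazy facts" idea, assuming edges are only ever added and never removed, is that each server keeps its own copy of the edge set and exchanges batches of new edges whenever convenient. The class and method names below (`GraphReplica`, `take_pending`, `merge`) are hypothetical, and the sketch ignores transport, failures and re-forwarding of received facts.

```python
class GraphReplica:
    """One server's view of a graph that only ever gains edges."""

    def __init__(self):
        self.edges = set()    # every edge this replica knows about
        self.pending = set()  # locally added edges not yet sent to peers

    def add_edge(self, source, target):
        """Record a new fact locally; it is propagated later, not now."""
        edge = (source, target)
        if edge not in self.edges:
            self.edges.add(edge)
            self.pending.add(edge)

    def take_pending(self):
        """Return the batch of unsent edges and clear the backlog."""
        batch, self.pending = self.pending, set()
        return batch

    def merge(self, batch):
        """Absorb edges learned from a peer.

        Set union is commutative and idempotent, so batches may arrive
        late, out of order or more than once without harm.
        """
        self.edges |= batch
```

Because the graph only grows, a replica that is behind is merely incomplete rather than wrong, which is what would let the 4000 servers avoid staying strictly current.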
Google: Pregel. What do you need from a language to describe PageRank in 15 lines?
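As a rough answer to that question, the sketch below imitates the vertex-centric, superstep-based style associated with Pregel in plain Python. It is not Pregel's actual API: `run_supersteps` and `make_pagerank_compute` are hypothetical names, the driver runs in a single process, and vertices with no outgoing links are not handled specially, so it assumes every vertex has at least one out-link.

```python
def run_supersteps(out_links, compute, initial_value, supersteps):
    """Tiny single-process driver for a vertex-centric computation.

    In each superstep every vertex sees the messages sent to it in the
    previous superstep, updates its value, and sends one message along
    each outgoing edge. The topology itself never moves.
    """
    values = {v: initial_value for v in out_links}
    inbox = {v: [] for v in out_links}
    for step in range(supersteps):
        outbox = {v: [] for v in out_links}
        for v, targets in out_links.items():
            values[v], message = compute(step, values[v], inbox[v], len(targets))
            for target in targets:
                outbox[target].append(message)
        inbox = outbox
    return values


def make_pagerank_compute(num_vertices, damping=0.85):
    """Build the per-vertex function; this is the whole of PageRank."""
    def compute(step, value, messages, out_degree):
        if step > 0:  # superstep 0 only broadcasts the initial rank
            value = (1 - damping) / num_vertices + damping * sum(messages)
        return value, (value / out_degree if out_degree else 0.0)
    return compute
```

Under these assumptions the algorithm-specific part is the handful of lines in `compute`; what the language or framework has to supply is the notion of a vertex, its outgoing edges, message delivery between supersteps and a stopping rule.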