I am reading Hadoop: The Definitive Guide, 2nd Edition, and I am
struggling to figure out the exact formula Hadoop uses to calculate
network distance (pages 64-65). I have my guesses, but I would like
to know the exact formula.
The book gives an example with the following distances:
For example, imagine a node n1 on rack r1 in data center d1.
This can be represented as /d1/r1/n1.
Using this notation, here are the distances for the four scenarios:
• distance(/d1/r1/n1, /d1/r1/n1) = 0 (processes on the same node)
• distance(/d1/r1/n1, /d1/r1/n2) = 2 (different nodes on the same rack)
• distance(/d1/r1/n1, /d1/r2/n3) = 4 (nodes on different racks in the
same data center)
• distance(/d1/r1/n1, /d2/r3/n4) = 6 (nodes in different data centers)
There is an illustration on that page as well.
Here is the link to the illustration:
http://books.google.com/books?id=Nff49D7vnJcC&lpg=PA65&ots=IidrYuayXs&dq=hadoop%20network%20distance%20calculation&pg=PA65#v=onepage&q=hadoop%20network%20distance%20calculation&f=false
If the distance to a node on a different rack is 4 and the distance to a
node on the same rack is 2, what is the distance between any other pair
of nodes on the same rack? Is it 2 as well? Can a distance ever be 1?
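For what it is worth, my guess is that the distance is the number of tree
edges between the two nodes, that is, the sum of each node's hops up to
their closest common ancestor. I have not checked this against the Hadoop
source. A minimal Python sketch of that guess (the function name
`distance` and the path parsing are mine, not Hadoop's API) reproduces
the four values from the book:

```python
def distance(a: str, b: str) -> int:
    """Guessed formula: hops from each node up to the closest common ancestor.

    Paths look like "/d1/r1/n1" (data center / rack / node).
    This is an illustration of my guess, not Hadoop's actual code.
    """
    parts_a = a.strip("/").split("/")
    parts_b = b.strip("/").split("/")

    # Count how many leading path components the two locations share.
    common = 0
    for x, y in zip(parts_a, parts_b):
        if x != y:
            break
        common += 1

    # Each remaining component is one hop up to the common ancestor.
    return (len(parts_a) - common) + (len(parts_b) - common)


if __name__ == "__main__":
    print(distance("/d1/r1/n1", "/d1/r1/n1"))  # same node
    print(distance("/d1/r1/n1", "/d1/r1/n2"))  # same rack
    print(distance("/d1/r1/n1", "/d1/r2/n3"))  # same data center
    print(distance("/d1/r1/n1", "/d2/r3/n4"))  # different data centers
```

If this guess is right, any two distinct nodes on the same rack are always
at distance 2, and a distance of 1 would only come up between a node and
its own rack, for example distance("/d1/r1/n1", "/d1/r1").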
Thank you,
Edmon
http://it.toolbox.com/blogs/lim
