hadoop-common-dev mailing list archives

From 朱 偉民 <xim-...@tsm.kddilabs.jp>
Subject RE: I found a bug of the source NetworkTopology::pseudoSortByDistance
Date Mon, 08 Sep 2008 04:39:31 GMT
Hi Hairong,

I understand that the NetworkTopology.pseudoSortByDistance method aims
to be simple and fast.
But there really is a bug.

Test Case:
The local node is not found,
only one local rack node is found, and
that local rack node is at position 0 in the node array.

In this case,
localRackNode is 0,
tempIndex is 0, and
(localRackNode != -1 && localRackNode != tempIndex) evaluates to false.

The swap is therefore skipped, the local rack node is not counted as placed,
and a random node is put at the beginning of the array instead.

I suggest fixing it as follows:

      // tempIndex is 0 and the local rack node is already at position 0:
      // return without doing anything
      if (tempIndex == 0 && localRackNode == 0) {
        return;
      }
      
      // swap the local rack node and the node at position tempIndex
      if(localRackNode != -1 && localRackNode != tempIndex ) {
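
The effect of the guard can be checked in isolation with a minimal sketch. It models only the tail of the method, assuming that the swap is followed by tempIndex++ and that a random node is placed at position 0 whenever tempIndex is still 0. The class name PseudoSortTail, the finish helper, the withGuard flag, and the "rack/host" node names are hypothetical and are not part of Hadoop.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical model of the last steps of pseudoSortByDistance; not Hadoop code.
final class PseudoSortTail {

    static void swap(String[] nodes, int i, int j) {
        String tmp = nodes[i];
        nodes[i] = nodes[j];
        nodes[j] = tmp;
    }

    // tempIndex: 1 if the local node was already moved to position 0, else 0.
    // localRackNode: index of the local rack node that was found, or -1.
    // withGuard: apply the early return proposed in this message.
    static void finish(String[] nodes, int tempIndex, int localRackNode,
                       boolean withGuard, Random r) {
        if (withGuard && tempIndex == 0 && localRackNode == 0) {
            return; // local rack node is already at the front
        }
        // swap the local rack node and the node at position tempIndex
        if (localRackNode != -1 && localRackNode != tempIndex) {
            swap(nodes, tempIndex, localRackNode);
            tempIndex++; // assumption: the original advances tempIndex here
        }
        // assumption: nothing was placed at the front, so pick a random node
        if (tempIndex == 0 && nodes.length != 0) {
            swap(nodes, 0, r.nextInt(nodes.length));
        }
    }

    public static void main(String[] args) {
        // The reader is assumed to sit on rackA, so "rackA/h1" is the local rack node.
        String[] without = {"rackA/h1", "rackB/h2", "rackB/h3"};
        finish(without, 0, 0, false, new Random());
        System.out.println("without guard: " + Arrays.toString(without));

        String[] with = {"rackA/h1", "rackB/h2", "rackB/h3"};
        finish(with, 0, 0, true, new Random());
        System.out.println("with guard:    " + Arrays.toString(with));
    }
}
```

With the guard the array is left untouched, so the local rack node stays at position 0. Without it, position 0 receives whichever index the random generator returns, which may or may not be the local rack node.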

Thanks very much.

-----Original Message-----
From: Hairong Kuang [mailto:hairong@yahoo-inc.com] 
Sent: Saturday, September 06, 2008 2:24 AM
To: hadoop-dev
Subject: Re: I found a bug of the source NetworkTopology::pseudoSortByDistance

NetworkTopology.pseudoSortByDistance aims to be simple and fast because it
is used by every open operation. It is not intended to sort the nodes as the
method name indicates. Instead, it searches for the local node and a local
rack node and puts them at the beginning of the array. If neither is found, it
puts a random node there. Since most map/reduce jobs read from local nodes or
local rack nodes, this works out pretty well.
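
The behavior described above can be sketched as follows. This is a simplified two-pass illustration, not the single-pass Hadoop implementation: the class FrontLoadSketch, the frontLoad and rackOf helpers, and the "rack/host" string naming are all hypothetical stand-ins for the real Node and NetworkTopology types.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration of "local node first, then a local rack node,
// otherwise a random node"; not the Hadoop implementation.
final class FrontLoadSketch {

    // Assumes node names of the form "rack/host".
    static String rackOf(String node) {
        return node.substring(0, node.indexOf('/'));
    }

    static void swap(String[] nodes, int i, int j) {
        String tmp = nodes[i];
        nodes[i] = nodes[j];
        nodes[j] = tmp;
    }

    static void frontLoad(String reader, String[] nodes, Random r) {
        int next = 0; // next free slot at the front of the array
        if (reader != null) {
            // pass 1: move the local node, if present, to the front
            for (int i = 0; i < nodes.length; i++) {
                if (nodes[i].equals(reader)) {
                    swap(nodes, next++, i);
                    break;
                }
            }
            // pass 2: move one node on the reader's rack to the next slot
            for (int i = next; i < nodes.length; i++) {
                if (rackOf(nodes[i]).equals(rackOf(reader))) {
                    swap(nodes, next++, i);
                    break;
                }
            }
        }
        // neither was found: put a random node at position 0
        if (next == 0 && nodes.length != 0) {
            swap(nodes, 0, r.nextInt(nodes.length));
        }
    }

    public static void main(String[] args) {
        String[] nodes = {"r2/h5", "r1/h2", "r1/h1"};
        frontLoad("r1/h1", nodes, new Random());
        System.out.println(Arrays.toString(nodes)); // prints [r1/h1, r1/h2, r2/h5]
    }
}
```

Because the sketch counts a node as placed even when it is already in the right slot, a local rack node found at position 0 stays there and no random node is chosen.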
 
Hairong


On 9/4/08 11:08 PM, "朱 偉民" <xim-shu@tsm.kddilabs.jp> wrote:

> Hello, can you help me
> 
> I am a Hadoop user. I found a bug in
> NetworkTopology::pseudoSortByDistance in version 18.0.
> The bug is:
> 1. When the local node is not found but a local rack node is found and that
>    node's position in the node array is 0, a random node is put at position 0.
> 2. When neither the local node nor a local rack node is found but a local
>    datacenter node is found, a random node is put at position 0.
> 3. When two or more data nodes are tied for the nearest distance, Hadoop
>    cannot read the data from one of them chosen at random.
> 
> I changed the source code to fix the bug, but I can't submit the source
> code file to the Hadoop server.
> 
> The source is:
> 
>   public synchronized void pseudoSortByDistance(Node reader, Node[] nodes) {
>     if (nodes.length == 0) return;
>
>     if (reader != null) {
>       int[] distances = new int[nodes.length];
>       // get their distances to reader
>       for (int i = 0; i < nodes.length; i++) {
>         distances[i] = getDistance(reader, nodes[i]);
>       }
>       // sort the nodes array by distance to reader
>       for (int i = 0; i < distances.length; i++) {
>         for (int j = i + 1; j < distances.length; j++) {
>           if (distances[i] > distances[j]) {
>             swap(nodes, i, j);
>             // keep distances aligned with nodes; otherwise later
>             // comparisons use stale values
>             int tmp = distances[i];
>             distances[i] = distances[j];
>             distances[j] = tmp;
>           }
>         }
>       }
>       // count the nodes whose distance to reader equals the first
>       // (nearest) node's distance and put a random one of them at position 0
>       int i;
>       for (i = 0; i < distances.length; i++) {
>         if (distances[i] != distances[0]) break;
>       }
>       if (i != 0) swap(nodes, 0, r.nextInt(i));
>     } else { // put a random node at position 0 if reader is null
>       swap(nodes, 0, r.nextInt(nodes.length));
>     }
>   }
> 
> 



