giraph-user mailing list archives

From Pavan Kumar A <pava...@outlook.com>
Subject RE: Optimal number of Workers
Date Wed, 16 Apr 2014 08:28:15 GMT
Giraph uses threads for compute, the netty server, the netty client on workers, execution pools,
input, output, etc. You can see most of these options in org.apache.giraph.conf.GiraphConstants,
for instance:
  /** Netty client threads */
  IntConfOption NETTY_CLIENT_THREADS =
      new IntConfOption("giraph.nettyClientThreads", 4, "Netty client threads");

  /** Netty server threads */
  IntConfOption NETTY_SERVER_THREADS =
      new IntConfOption("giraph.nettyServerThreads", 16, "Netty server threads");

  /** Number of threads for vertex computation */
  IntConfOption NUM_COMPUTE_THREADS =
      new IntConfOption("giraph.numComputeThreads", 1,
          "Number of threads for vertex computation");

  /** Number of threads for input split loading */
  IntConfOption NUM_INPUT_THREADS =
      new IntConfOption("giraph.numInputThreads", 1,
          "Number of threads for input split loading");
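
A minimal sketch of how these options could be lowered for a single-machine run. It only builds a key=value string from the four option names quoted above. The class name, the helper names and the reduced values are hypothetical choices for illustration, and the "-ca" prefix assumes your GiraphRunner version accepts custom arguments in that comma-separated form, so check its usage output first.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Giraph: builds a custom-argument string
// from the thread options quoted from GiraphConstants.
class GiraphThreadOptions {

    /** Illustrative reduced values; tune them for your own machine. */
    static Map<String, Integer> reducedThreadOptions() {
        Map<String, Integer> options = new LinkedHashMap<>();
        options.put("giraph.nettyClientThreads", 1); // default quoted above: 4
        options.put("giraph.nettyServerThreads", 2); // default quoted above: 16
        options.put("giraph.numComputeThreads", 1);  // default quoted above: 1
        options.put("giraph.numInputThreads", 1);    // default quoted above: 1
        return options;
    }

    /** Joins the options as key=value pairs separated by commas. */
    static String toCustomArguments(Map<String, Integer> options) {
        StringJoiner joiner = new StringJoiner(",");
        for (Map.Entry<String, Integer> entry : options.entrySet()) {
            joiner.add(entry.getKey() + "=" + entry.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Assumption: the runner takes "-ca key=value,key=value".
        System.out.println("-ca " + toCustomArguments(reducedThreadOptions()));
    }
}
```

The same keys can also be set on the job configuration directly if you build the job in code rather than through the command line.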

The idea is that if you run your job on a cluster of 5 machines, typically 1 machine is the
master and 4 of them are "workers", which load the graph and compute on it. Each worker
is a separate machine, and to maximize its utilization we can use as many threads as it can
handle.
However, if you are running in pseudo-distributed mode, all workers run on the same machine and
each still tries to launch the default number of threads set in the config. Although each worker
is now a thread (instead of a machine), it still launches all these other threads indiscriminately.
Anyway, you can configure the threads spawned by workers to reduce the overall number of
threads launched on your one machine.
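
To make the arithmetic concrete, here is a rough sketch that sums only the four thread options quoted earlier in this message and multiplies by the number of workers on one machine. The class and method names are hypothetical, and the result is only a partial count: a real worker also starts JVM, Hadoop and other Giraph threads that are not modelled here, so the number observed in practice can differ.

```java
// Hypothetical estimator, not part of Giraph: counts only the threads
// controlled by the four options quoted from GiraphConstants.
class ThreadEstimate {

    /** Threads per worker from the four configurable pools. */
    static int perWorker(int computeThreads, int nettyClientThreads,
                         int nettyServerThreads, int inputThreads) {
        return computeThreads + nettyClientThreads + nettyServerThreads + inputThreads;
    }

    /** Threads on one machine when all workers share it. */
    static int total(int workers, int threadsPerWorker) {
        return workers * threadsPerWorker;
    }

    public static void main(String[] args) {
        // Defaults quoted above: compute = 1, netty client = 4,
        // netty server = 16, input = 1.
        int perWorker = perWorker(1, 4, 16, 1);
        // 10 workers on a single machine, as in the original question.
        int total = total(10, perWorker);
        System.out.println("Configured threads per worker: " + perWorker);
        System.out.println("Configured threads on the machine: " + total);
    }
}
```

Lowering the netty server thread count has the largest effect in this estimate, since it has the largest default among the four options.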
From: chadijaber986@hotmail.com
To: user@giraph.apache.org
Subject: Optimal number of Workers
Date: Tue, 15 Apr 2014 13:34:53 +0200




Hello! Can anybody explain how threads are used by a worker in Giraph? For which purposes?
How does a worker determine the number of threads to use?
I often get the following error: org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError:
unable to create new native thread.
A check on the number of threads per worker shows child processes with 100 threads per worker
process (10 workers on a 12-processor machine), which seems too large to me. If I reduce the
number of workers, the number of threads decreases. How should we choose the number of workers?
Thanks in advance.
Chadi

 		 	   		   		 	   		  