giraph-user mailing list archives

From Claudio Martella <claudio.marte...@gmail.com>
Subject Re: Basic questions about Giraph internals
Date Thu, 06 Feb 2014 10:28:37 GMT
Hi Alex,

answers are inline.


On Thu, Feb 6, 2014 at 11:22 AM, Alexander Frolov
<alexndr.frolov@gmail.com> wrote:

> Hi, folks!
>
> I have started some small research on the Giraph framework, and I do not
> have much experience with Giraph and Hadoop :-(.
>
> I would like to ask several questions about how things work in Giraph
> that are not straightforward to me. I am trying to read the sources,
> but sometimes it is not too easy ;-)
>
> So here they are:
>
> 1) How are Workers assigned to TaskTrackers?
>

Each worker is a mapper, and mapper tasks are assigned to tasktrackers by
the jobtracker. Giraph has no control over that placement, and because
Giraph doesn't need data locality the way MapReduce does, it makes no
attempt to influence it.


>
> 2) How are vertices assigned to Workers? Does it depend on the distribution
> of the input file across DataNodes? Is there any choice of distribution
> policy or not?
>

In the default scheme, vertices are assigned through modulo hash
partitioning. Given k workers, vertex v is assigned to worker i according
to hash(v) % k = i.
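
As a rough sketch of that formula (this is not Giraph's actual partitioner
code; the class and method names below are made up for illustration, and
Java's hashCode() stands in for hash(v)), it could look like this:

```java
// Hypothetical illustration only; not a class from the Giraph code base.
final class ModuloHashPartitionSketch {

    // Returns the worker index i in [0, numWorkers) for a vertex id,
    // computed as hash(v) mod k.
    static int workerFor(Object vertexId, int numWorkers) {
        if (numWorkers <= 0) {
            throw new IllegalArgumentException("numWorkers must be positive");
        }
        // hashCode() can be negative, and Java's % would then give a negative
        // result; floorMod keeps the index inside [0, numWorkers).
        return Math.floorMod(vertexId.hashCode(), numWorkers);
    }

    public static void main(String[] args) {
        int k = 4; // assumed number of workers
        for (long v = 0; v < 8; v++) {
            System.out.println("vertex " + v + " -> worker "
                + workerFor(Long.valueOf(v), k));
        }
    }
}
```

Note that the assignment depends only on the vertex id and on k, not on
where the input splits are stored.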


>
> 3) How are Workers and Map tasks related to each other? (1:1)? (n:1)?
> (1:n)?
>

It's 1:1. Each worker is implemented by a mapper task. The master usually
runs in an additional mapper, although it does not have to.


>
> 4) Can Workers migrate from one TaskTracker to the other?
>

Workers do not migrate. A Giraph computation is not dynamic with respect to
the assignment and size of its tasks.


>
> 5) What is the best way to monitor Giraph app execution (progress, worker
> assignment, load balancing etc.)?
>

Just like you would for a standard MapReduce job: go to the job's page in
the jobtracker web interface.


>
> I think this is all for the moment. Thank you.
>
> Testbed description:
> Hardware: 8 node dual-CPU cluster with IB FDR.
> Giraph: release-1.0.0-RC2-152-g585511f
> Hadoop: hadoop-0.20.203.0, hadoop-rdma-0.9.8
>
> Best,
>    Alex
>



-- 
   Claudio Martella
