giraph-user mailing list archives

From Kenrick Fernandes <>
Subject Re: Worker to task id mapping in giraph
Date Mon, 27 Nov 2017 19:56:25 GMT

As far as I know, there is no way to do this (at least not without changing
the network names of the machines, which is outside the scope of Giraph).
However, a simple solution might be to build a mapping data structure that
stores the underlying node IDs and lets you access them contiguously.
Alternatively, exclude nodes only from the start or the end so that the
consecutive numbering is preserved.
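
A minimal sketch of such a mapping structure, assuming the task IDs have
already been collected into a plain collection of integers (in Giraph they
would presumably come from calling getTaskId() on each WorkerInfo). The class
name TaskIdIndex and its methods are hypothetical and not part of Giraph; the
idea is only to sort the sparse IDs once and hand out dense indices 0..n-1.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical helper (not a Giraph class): maps sparse task IDs to dense indices. */
final class TaskIdIndex {
    // Task IDs in ascending order; the position in this list is the dense index.
    private final List<Integer> sortedTaskIds;
    // Reverse lookup from a task ID to its dense index.
    private final Map<Integer, Integer> denseIndexByTaskId = new HashMap<>();

    TaskIdIndex(Collection<Integer> taskIds) {
        // TreeSet sorts the IDs and drops duplicates.
        sortedTaskIds = new ArrayList<>(new TreeSet<>(taskIds));
        for (int i = 0; i < sortedTaskIds.size(); i++) {
            denseIndexByTaskId.put(sortedTaskIds.get(i), i);
        }
    }

    /** Dense index in [0, size()) for a known task ID. */
    int indexOf(int taskId) {
        Integer index = denseIndexByTaskId.get(taskId);
        if (index == null) {
            throw new IllegalArgumentException("Unknown task ID: " + taskId);
        }
        return index;
    }

    /** Original task ID stored at the given dense index. */
    int taskIdAt(int index) {
        return sortedTaskIds.get(index);
    }

    int size() {
        return sortedTaskIds.size();
    }

    public static void main(String[] args) {
        // Made-up example IDs with gaps, similar to the situation described.
        List<Integer> sparseIds = List.of(1, 2, 4, 43);
        TaskIdIndex index = new TaskIdIndex(sparseIds);
        for (int i = 0; i < index.size(); i++) {
            System.out.println("dense " + i + " -> task ID " + index.taskIdAt(i));
        }
    }
}
```

Inside createInitialPartitionOwners you could then work with the dense index
(for example partition % size()) and translate back to the real task ID only
when you need to look up the worker.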


On Mon, Nov 27, 2017 at 12:54 AM, Ravikant Dindokar wrote:

> Hi,
> I am trying to change the mapping of partitions to workers in
> createInitialPartitionOwners(Collection<WorkerInfo> availableWorkerInfos,
> int maxWorkers) defined in org.apache.giraph.partition.
> The first argument to this method is the list of available workers and
> each worker has a task id associated with it.
> When I specify the number of workers as 40, the task IDs are 1 to 40.
> I have removed one node from the yarn-cluster by following this answer
> separately-specify-a-set-of-nodes-for-hdfs-and-others-for-mapreduce-jobs.
> After this change, the task IDs assigned to the workers are no longer
> consecutive (e.g. for 40 workers the task IDs range from 1 to 43). Some
> task IDs are skipped.
> Is there any way to make these task IDs strictly consecutive?
> Thanks
> Ravikant
