giraph-user mailing list archives

From Arjun Sharma
Subject Giraph Partitioning
Date Wed, 25 Feb 2015 02:25:17 GMT

I understand that by default, the number of partitions = (number of
workers)^2. So, if we have N workers, each worker will process N
partitions. I have a number of questions:
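
For concreteness, here is the arithmetic I am assuming, as a plain-Java sketch. It is not Giraph code: the class and method names are hypothetical, the squared default is only my understanding as stated above, and the even spread of partitions across workers is an assumption.

```java
public class PartitionCountSketch {
    // Assumption from this message: default partition count = workers^2.
    static int totalPartitions(int workers) {
        return workers * workers;
    }

    // Assuming partitions are spread evenly, each worker owns total / workers.
    static int partitionsPerWorker(int workers) {
        return totalPartitions(workers) / workers;
    }

    public static void main(String[] args) {
        int workers = 4; // hypothetical cluster size
        System.out.println("total partitions: " + totalPartitions(workers));
        System.out.println("partitions per worker: " + partitionsPerWorker(workers));
    }
}
```

So with 4 workers I would expect 16 partitions in total and 4 on each worker, which is the setting the questions below refer to.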

1- By default, does Giraph process the N partitions within a single worker
sequentially? If yes, when setting the parameter giraph.numComputeThreads,
will partitions within each thread be computed sequentially?

2- By default, does Giraph keep all partitions in memory?

3- If the answers to 1 and 2 are yes and yes, is there any advantage from
using multiple partitions versus a single partition in the case of single
threading per worker?

4- How do out-of-core partitions affect out-of-core messages? Are they
completely independent? For example, if the number of partitions kept in
memory is set to a number less than N, and at the same time all messages
are set to be kept in memory, will ALL messages be kept in memory, even
those from out-of-core partitions? If the situation is reversed, where all
partitions are kept in memory and out-of-core messaging is enabled, will
messages from memory-based partitions be saved to disk?

5- Is there a class like a PartitionContext, where you can access
preSuperstep and postSuperstep *per partition*, along the lines of
