giraph-user mailing list archives

From: Claudio Martella
Subject: Re: PageRankBenchmark scaling
Date: Fri, 11 Jan 2013 01:07:29 GMT
I suggest you start by trying ByteArrayPartition, and then move on to
out-of-core messages and/or out-of-core graph.
Also, make sure the mapper tasks are given enough heap memory in
the Hadoop cluster configuration.
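
For a rough sense of how far the target is from what currently runs, the
figures quoted in the original message below can be worked through. This is
only a minimal sketch: it assumes edges are spread evenly across the 23
workers, which the thread does not state, and the function and variable
names are illustrative rather than anything from Giraph.

```python
# Back-of-the-envelope scaling arithmetic using the numbers from the thread.
# Assumption (not stated in the thread): edges are split evenly across workers.

def edge_count(vertices: int, edges_per_vertex: int) -> int:
    """Total number of edges for a graph with a fixed out-degree."""
    return vertices * edges_per_vertex

WORKERS = 23  # worker count reported in the original message

current_edges = edge_count(16_000_000, 16)    # largest run that succeeded
target_edges = edge_count(50_000_000, 150)    # desired benchmark size

growth = target_edges / current_edges
current_per_worker = current_edges / WORKERS
target_per_worker = target_edges / WORKERS

print(f"current edges:     {current_edges:,}")
print(f"target edges:      {target_edges:,}")
print(f"growth factor:     {growth:.1f}x")
print(f"edges per worker:  {current_per_worker:,.0f} now, "
      f"{target_per_worker:,.0f} at target")
```

Under that assumption each worker would have to hold roughly 29 times as
many edges as in the largest successful run, which is why a more compact
partition representation and per-task heap size are the first things to look at.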

On Thu, Jan 10, 2013 at 9:23 PM, Pradeep Gollakota wrote:
> Hi All,
> I'm trying to run some benchmarks using the PageRankBenchmark tool on my
> cluster. However, I'm seeing some scaling issues.
> My cluster has 4 nodes, configured to run 24 map tasks. I'm running the
> benchmark with 23 workers. I've been able to get it to scale up to 256 million
> edges (16m vertices with 16 edges per vertex). However, when I try to scale
> higher than that, I get "GC overhead limit exceeded" errors. I
> tried modifying the PageRankComputation class to use object reuse,
> but to no avail.
> Does anyone have any thoughts on how I can scale this higher on my cluster?
> I'm trying to get to about 50 million vertices with 150 edges per vertex
> (7.5 billion edges).
> Thanks
> Pradeep

   Claudio Martella
