giraph-user mailing list archives

From Arun Kumar
Subject Giraph input partition
Date Wed, 30 Apr 2014 08:36:45 GMT

I was able to run the shortest path algorithm on a small input, but after
extending the input data to 1 GB, we now get the “java heap out of memory”
runtime exception.

We know only that Giraph's default mechanism for assigning a vertex to a
running Map instance is hash partitioning on the vertex. What we do not know
is how this hash partitioning translates into preparing the large input data
so that V1 is assigned to Partition 1 and V2 is assigned to Partition 2. The
input data partitioning would have to agree with the Giraph runtime, so that
when Partition 1 is assigned to Map 1, a message sent from V1 to V2 is
correctly routed to Map 2, which is assigned to process Partition 2.
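
To make the question concrete, here is a minimal sketch of hash partitioning
as we understand the general idea. It is plain Java and uses no Giraph
classes. The class HashPartitionSketch, the methods partitionFor and
workerFor, the use of Long.hashCode on the vertex ID, and the partition and
worker counts are all assumptions made for illustration, not Giraph's actual
implementation.

```java
// Illustrative sketch only: not Giraph code, all names are hypothetical.
class HashPartitionSketch {

    // Map a vertex ID to a partition by hashing the ID and taking the
    // non-negative remainder modulo the number of partitions.
    static int partitionFor(long vertexId, int partitionCount) {
        int hash = Long.hashCode(vertexId); // assumed hash function
        return Math.floorMod(hash, partitionCount);
    }

    // Map a partition to a worker (Map instance) round-robin.
    // This assignment rule is also an assumption.
    static int workerFor(int partition, int workerCount) {
        return partition % workerCount;
    }

    public static void main(String[] args) {
        int partitionCount = 4; // hypothetical value
        int workerCount = 2;    // hypothetical value
        for (long vertexId = 1; vertexId <= 5; vertexId++) {
            int partition = partitionFor(vertexId, partitionCount);
            int worker = workerFor(partition, workerCount);
            System.out.println("vertex " + vertexId
                    + " -> partition " + partition
                    + " -> worker " + worker);
        }
    }
}
```

Under these assumptions, the destination of a message depends only on the
target vertex ID, so any sender could compute where V2 lives without knowing
how the input file was laid out. Whether Giraph works this way is exactly
what we would like confirmed.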

Can somebody tell us how to prepare the input data (larger than 64 MB, the
default partition size) as partitions?


