giraph-user mailing list archives

From Neha Raj <neharaj...@gmail.com>
Subject Re: Request for information on Giraph custom Partitioner using external service
Date Fri, 13 Jul 2018 13:50:03 GMT
Hi Ravikant,

Thanks for responding to my query. This definitely helped me validate the
steps I followed to create a custom partitioner.
I am currently looking for a mechanism in Giraph that lets me hold a
global value for use by a Giraph application across the distributed
system, like a Hadoop counter for Hadoop jobs.

I read about Giraph Aggregators, which do the same thing for Giraph jobs,
but I am still figuring out how to invoke an aggregator from my custom
partitioner class. The examples I have seen call aggregators only from
Computation classes. Any pointers here would be helpful! Or, if there
is an alternative way of maintaining a global variable across the workers
in Giraph, please do let me know.

Best Regards,
Neha Raj

On Tue, Jul 10, 2018 at 2:22 PM, Neha Raj <neharaj.06@gmail.com> wrote:

> Hi,
>
> I am working on graph partitioning algorithms and have chosen Giraph as
> the graph processing system to run graph problems; I am very new to both. I
> would like to provide external partitioning information (in the form of a txt
> file) to Giraph. For this I have created a custom partitioner (something like
> HashPartitionFactory), which reads the external file for the graph partition Id.
>
> While debugging, I realized that this partition logic is invoked several times
> (during the Giraph supersteps), and reading the same external file multiple
> times is not time-efficient. To handle this, I wish to create a
> global (across the distributed system) Map variable that holds {vertex Id,
> partition Id} as key-value pairs, and I want to populate this variable
> from the external file once during a Giraph job run. I have tried several
> ways to create and initialize such a global variable, but whether the
> variable actually gets populated for a Giraph job is non-deterministic (i.e.,
> sometimes the map is populated with values, sometimes not).
>
> I think there might be an issue in how I am creating the Map variable
> and initializing it so that it is ready before my custom partitioning logic
> calls it. Can somebody please point me to the correct place to plug this
> information into a Giraph job, and possibly a correct way of creating a
> global variable with respect to Giraph's distributed processing?
>
> Thanks & Regards,
> Neha
>
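
For illustration, the "read the file once, then look up from a Map" idea in
the quoted message could be sketched in plain Java roughly as below. This is
only a sketch under stated assumptions, not Giraph's API: the class name
PartitionMapHolder, its methods, and the file format (one whitespace-separated
"vertexId partitionId" pair per line) are all hypothetical. Note also that a
static field is shared only within one JVM, so each worker process would load
its own copy and the file would have to be readable from every worker; that
may be one reason a "global" static map appears populated in some places and
empty in others.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: loads {vertexId -> partitionId} at most once per JVM. */
final class PartitionMapHolder {
    // volatile so that a fully built map is safely visible to other threads
    private static volatile Map<Long, Integer> map;

    private PartitionMapHolder() { }

    /** Returns the cached map, reading the file only on the first call. */
    static Map<Long, Integer> get(String path) {
        Map<Long, Integer> local = map;
        if (local == null) {
            synchronized (PartitionMapHolder.class) {
                local = map;
                if (local == null) {          // re-check inside the lock
                    local = load(Paths.get(path));
                    map = local;
                }
            }
        }
        return local;
    }

    /** Looks up a vertex; falls back to a modulo rule for unlisted vertices. */
    static int partitionOf(long vertexId, String path, int partitionCount) {
        Integer p = get(path).get(vertexId);
        return p != null ? p : (int) Math.floorMod(vertexId, (long) partitionCount);
    }

    // Assumed format: "vertexId partitionId" per line; blank lines are skipped.
    private static Map<Long, Integer> load(Path file) {
        Map<Long, Integer> result = new HashMap<>();
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue;
                }
                String[] parts = line.split("\\s+");
                result.put(Long.parseLong(parts[0]), Integer.parseInt(parts[1]));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Collections.unmodifiableMap(result);
    }
}
```

A custom partitioner could then call PartitionMapHolder.partitionOf(...)
wherever it needs a partition Id. In this sketch the path argument is ignored
once the map has been loaded, and the modulo fallback for unlisted vertices is
an arbitrary choice made for the example.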
