giraph-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Schweiger, Tom" <thschwei...@ebay.com>
Subject RE: Generating unique vertex id's for addVertexRequest
Date Tue, 29 Jul 2014 16:34:28 GMT

With any generated ID like a hash, there will always be the possibility of a collision (different
ids creating the same generated id).  However, because you are using a long, the size of the
hash space is quite large.  a collision won't become likely until you have around 4 billion
vertexes.  If your graph has, say, 10 million vertexes, you can be 99.97% sure there are no
collisions.  Put another way, you would have to generate  3700 graphs. each with 10 million
vertexes, before you got one with a single collision.

Your other options are:

* Manage your ids, using a cross-reference table, so that you guarantee a one-to-one relationship
between the id and the long.

* Change the classes you are using in Giraph to use Text instead of Long for the vertex ids.


________________________________
From: Panagiotis Eustratiadis [ep.pan.dit@gmail.com]
Sent: Tuesday, July 29, 2014 3:14 AM
To: user@giraph.apache.org
Subject: Generating unique vertex id's for addVertexRequest

Hello everyone,

I'm looking for a way to generate unique id's (of type Long) for the addVertexRequest. For
example, a very silly implementation that works for graphs with less than 100 vertices would
look like this:

public void compute(Iterable<NullWritable> messages) {
...
    long generatedId = generateId(long getId().get());
    addVertexRequest(new LongWritable(generatedId), new DoubleWritable(0));
...
}

private long generateId(long seed) {
    return seed + 100;
}

But as I said, this is just silly. How can I modify the generateId so that I know the vertex
id is unique regardless of the graph size?

Panagiotis Eustratiadis.

Mime
View raw message