From: "Schweiger, Tom"
To: "user@giraph.apache.org"
Subject: RE: Generating unique vertex id's for addVertexRequest
Date: Tue, 29 Jul 2014 16:34:28 +0000

With any generated ID such as a hash, there is always the possibility of a collision (different ids producing the same generated id). However, because you are using a long, the hash space is quite large: assuming a well-distributed 64-bit hash, a collision won't become likely until you have around 4 billion vertexes. If your graph has, say, 10 million vertexes, the birthday approximation n^2 / 2^65 puts the chance of any collision at roughly 2.7 in a million, so you can be about 99.9997% sure there are no collisions. Put another way, you would expect to generate roughly 370,000 graphs, each with 10 million vertexes, before you got one with a single collision.
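
Estimates of this kind can be checked with the birthday approximation. The sketch below is only a back-of-the-envelope aid: it assumes an ideal hash that spreads ids uniformly over 2^64 values, and the class and method names (CollisionOdds, collisionProbability) are made up for illustration and are not part of Giraph.

```java
// Sketch only: birthday-bound estimate for an ideal, uniformly distributed hash.
// CollisionOdds and collisionProbability are illustrative names, not Giraph APIs.
class CollisionOdds {

    /**
     * Approximate probability of at least one collision when n ids are
     * hashed uniformly into a space of 2^bits values.
     */
    static double collisionProbability(long n, int bits) {
        double pairs = 0.5 * n * (n - 1.0);   // number of distinct id pairs
        double space = Math.pow(2.0, bits);   // size of the hash space
        // 1 - exp(-pairs / space), computed stably for very small arguments
        return -Math.expm1(-pairs / space);
    }

    public static void main(String[] args) {
        long[] sizes = {10_000_000L, 100_000_000L, 4_000_000_000L};
        for (long n : sizes) {
            double p = collisionProbability(n, 64);
            System.out.printf("n=%,d  P(collision)=%.3g%n", n, p);
        }
    }
}
```

A real hash function is not perfectly uniform, so treat the result as an order-of-magnitude guide rather than a guarantee.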

Your other options are:

* Manage your ids, using a cross-reference table, so that you guarantee a one-to-one relationship between the id and the long.

* Change the classes you are using in Giraph to use Text instead of Long for the vertex ids.
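
For the cross-reference option, a minimal single-process sketch might look like the following. IdTable is a hypothetical helper, not a Giraph class, and it assumes the original ids are strings. In a distributed job the table would have to be built in a preprocessing step or shared between workers in some other way, which this sketch does not attempt.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: IdTable is a hypothetical helper, not part of Giraph.
// It hands out consecutive longs, so two different keys never share an id.
class IdTable {
    private final Map<String, Long> keyToId = new HashMap<>();
    private final Map<Long, String> idToKey = new HashMap<>();
    private long next = 0L;

    /** Returns the long already assigned to key, or assigns the next unused one. */
    long idFor(String key) {
        Long existing = keyToId.get(key);
        if (existing != null) {
            return existing;
        }
        long id = next++;
        keyToId.put(key, id);
        idToKey.put(id, key);
        return id;
    }

    /** Reverse lookup; returns null if the id was never assigned. */
    String keyFor(long id) {
        return idToKey.get(id);
    }
}
```

Because every id comes from a counter rather than a hash, uniqueness holds by construction, at the cost of keeping the table in memory.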


From: Panagiotis Eustratiadis [ep.pan.dit@gmail.com]
Sent: Tuesday, July 29, 2014 3:14 AM
To: user@giraph.apache.org
Subject: Generating unique vertex id's for addVertexRequest

Hello everyone,

I'm looking for a way to generate unique ids (of type Long) for the addVertexRequest. For example, a very silly implementation that works for graphs with fewer than 100 vertices would look like this:

public void compute(Iterable<NullWritable> messages) {
...
    long generatedId = generateId(getId().get());
    addVertexRequest(new LongWritable(generatedId), new DoubleWritable(0));
...
}

private long generateId(long seed) {
    return seed + 100;
}

But as I said, this is just silly. How can I modify generateId so that I know the vertex id is unique regardless of the graph size?

Panagiotis Eustratiadis.