Date: Tue, 24 Apr 2012 08:53:42 -0700
Subject: Missing features in Giraph
From: Jan van der Lugt
To: giraph-user@incubator.apache.org

Dear all,

After having worked with Giraph for some weeks, I feel that two features are 'missing' in Giraph. I may simply have missed them in the Javadoc, since the documentation is still a work in progress.

In another Google Pregel clone, Stanford GPS, it is possible to define a global object map, which all workers can use to share data, such as the current phase of the algorithm. I have not been able to find such a feature in Giraph. Of course it would be possible to (ab)use aggregators for this, but I doubt this is the easiest or most efficient approach.

Furthermore, it would be very helpful if there were one special vertex that has the role of a master. It should not have to correspond to an existing vertex in the graph; in fact, it would be easier if it did not. This master node would then be able to perform some centralized steps in the algorithm, whose output could then be shared with the other workers via the global object map. The master node could have the same interface as the workers (compute(), getAggregator(), getConf(), etc.). Again, it would be possible to solve this differently, for example in the VertexReader, but this would make the code less elegant and would require picking a vertex id that does not exist in the graph, which is difficult if the input is not known in advance.
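
To make the two requests concrete, here is a minimal sketch of how a global object map and a master step could fit together. It is only an illustration under stated assumptions: the names MasterSketch, GlobalObjectMap, MasterCompute and run are hypothetical and are not part of the Giraph or GPS APIs, and the supersteps are simulated sequentially in a single JVM rather than distributed over workers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names exist in Giraph or GPS. */
public class MasterSketch {

    /** Global object map: written by the master, read by the workers. */
    static final class GlobalObjectMap {
        private final Map<String, Object> values = new HashMap<>();

        void put(String key, Object value) { values.put(key, value); }

        Object get(String key) { return values.get(key); }
    }

    /** Centralized step, run once per superstep before the workers run. */
    interface MasterCompute {
        void compute(long superstep, GlobalObjectMap globals);
    }

    /**
     * Simulates the supersteps in one JVM and records what each
     * worker reads from the global object map.
     */
    static List<String> run(int supersteps, int workers, MasterCompute master) {
        GlobalObjectMap globals = new GlobalObjectMap();
        List<String> log = new ArrayList<>();
        for (long s = 0; s < supersteps; s++) {
            // The master decides the phase first ...
            master.compute(s, globals);
            // ... and every worker then sees the same value.
            for (int w = 0; w < workers; w++) {
                log.add("superstep " + s + " worker " + w
                        + " sees phase " + globals.get("phase"));
            }
        }
        return log;
    }

    public static void main(String[] args) {
        // Example master: an initialization phase, then iteration.
        MasterCompute master = (superstep, globals) ->
                globals.put("phase", superstep == 0 ? "INIT" : "ITERATE");
        run(2, 2, master).forEach(System.out::println);
    }
}
```

In this sketch the master is not a vertex at all, so no unused vertex id has to be chosen, and the workers only ever read the map.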

I realize I am biased because of my earlier experience with Stanford GPS, but I feel these features would not be very difficult to implement, nor would they add bulk to the API. They would, however, make many graph algorithms easier to implement, because many of these algorithms have some notion of a centralized master node. During the next 5 months I will be working with Giraph for my Master's project, so I would be more than willing to help implement these features, ideally after receiving some pointers from more experienced Giraph developers.

Regards,
Jan van der Lugt