From: Claudio Martella
Date: Fri, 22 Nov 2013 00:43:37 +0100
Subject: Re: Waking up all the vertices after every vertex calls vote to halt
To: "user@giraph.apache.org"

The simplest approach is to keep a flag on each vertex that signals
whether it is really active. If it is not, the vertex returns
immediately. This means that vertices never actually vote to halt.
Computationally, it costs you little more than this check. You can
handle the rest of the logic with some aggregators and the master
compute.

On Thu, Nov 21, 2013 at 11:57 PM, Ameya Vilankar <ameya.vilankar@gmail.com> wrote:

> Hi,
> I have implemented Alternating Least Squares on top of Apache Giraph.
> On each edge, I store the type of the edge. An edge can be either a
> training edge or a testing edge. When I run the algorithm, I use only
> the ratings on the training edges to tune the vectors on the vertices.
> The algorithm ends in one of two scenarios:
> 1. All the vertices have tuned their vectors to within the tolerable
>    error. At this point there are no active vertices, since every
>    vertex has called vote to halt.
> 2. We reached the maximum number of supersteps. At this point, some
>    vertices are still active, since they received messages from the
>    last superstep.
>
> I have written an Aggregator that accumulates the training error
> during this process. But now I want to calculate the
> prediction/testing error, which is measured along the edges labelled
> as testing edges. At this point in my algorithm there are either no
> active vertices or only a few. I need all the vertices to send their
> vectors along all of their testing edges to compute the testing error
> and send it to an error sum aggregator. For this I need to activate
> all the vertices.
> Hope it is clear to you now.
>
> Thanks,
> Ameya.
> Zynga
>
>
> On Thu, Nov 21, 2013 at 2:45 PM, Claudio Martella <
> claudio.martella@gmail.com> wrote:
>
>> Hi Ameya,
>>
>> I'm not sure I understand the problem correctly. The maximum number
>> of supersteps allows you to halt the computation when that threshold
>> is reached. The RMSE can be computed within the master compute.
>>
>> What do you want to achieve exactly?
>>
>>
>> On Thu, Nov 21, 2013 at 10:47 PM, Ameya Vilankar <
>> ameya.vilankar@gmail.com> wrote:
>>
>>> Hi,
>>> I am implementing a machine learning algorithm on top of Giraph.
>>> The algorithm converges when all the vertices call voteToHalt or
>>> when some maximum number of supersteps has completed.
>>> I want to calculate the RMSE after the algorithm has converged. The
>>> problem is that either all the vertices have called vote to halt or
>>> (in the case where we reach max supersteps) only some of them are
>>> still active.
>>> I need to reactivate or wake up all the vertices. Is there any way
>>> in Giraph that I could do this?
>>>
>>> Thanks,
>>> Ameya Vilankar
>>> Zynga
>>>
>>
>>
>>
>> --
>> Claudio Martella
>> claudio.martella@gmail.com
>>
>
>

--
Claudio Martella
claudio.martella@gmail.com
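
To make the flag-based suggestion in the reply concrete, here is a minimal sketch in plain Java. It deliberately does not use the Giraph API: the superstep loop, the counters standing in for aggregators, and the phase switch standing in for the master compute are all simulated, and every name in it (FlagBasedHaltingSketch, SketchVertex, reallyActive, Phase, run) is hypothetical. A scalar value per vertex stands in for the ALS vector, and a simple "halve the gap" update stands in for the real training step.

```java
import java.util.ArrayList;
import java.util.List;

public class FlagBasedHaltingSketch {

    /** Global phase, decided by the simulated master and read by every vertex. */
    enum Phase { TRAIN, EVALUATE }

    /** Hypothetical vertex state; a scalar stands in for the ALS vector. */
    static final class SketchVertex {
        double value = 0.0;
        final double trainTarget;    // stands in for ratings on training edges
        final double testTarget;     // stands in for ratings on testing edges
        boolean reallyActive = true; // the per-vertex flag from the reply

        SketchVertex(double trainTarget, double testTarget) {
            this.trainTarget = trainTarget;
            this.testTarget = testTarget;
        }
    }

    /** Outcome of one simulated run. */
    static final class Result {
        final int trainingSupersteps;
        final int evaluatedVertices;
        final double testRmse;

        Result(int trainingSupersteps, int evaluatedVertices, double testRmse) {
            this.trainingSupersteps = trainingSupersteps;
            this.evaluatedVertices = evaluatedVertices;
            this.testRmse = testRmse;
        }
    }

    static Result run(List<SketchVertex> vertices, double tolerance, int maxSupersteps) {
        Phase phase = Phase.TRAIN;
        int superstep = 0;
        while (true) {
            // Stand-ins for aggregators, reset at the start of each superstep.
            int activeCount = 0;
            int evaluated = 0;
            double squaredTestError = 0.0;

            // Vertex phase: every vertex is visited because none ever votes to halt.
            for (SketchVertex v : vertices) {
                if (phase == Phase.EVALUATE) {
                    double err = v.value - v.testTarget;
                    squaredTestError += err * err;
                    evaluated++;
                    continue;
                }
                if (!v.reallyActive) {
                    continue; // the cheap flag check replaces voting to halt
                }
                v.value += 0.5 * (v.trainTarget - v.value); // toy training update
                if (Math.abs(v.trainTarget - v.value) < tolerance) {
                    v.reallyActive = false;
                } else {
                    activeCount++;
                }
            }

            // Master phase: read the aggregated values and decide what happens next.
            if (phase == Phase.EVALUATE) {
                return new Result(superstep, evaluated, Math.sqrt(squaredTestError / evaluated));
            }
            superstep++;
            if (activeCount == 0 || superstep >= maxSupersteps) {
                phase = Phase.EVALUATE; // one extra superstep in which every vertex reports
            }
        }
    }

    public static void main(String[] args) {
        List<SketchVertex> vertices = new ArrayList<>();
        vertices.add(new SketchVertex(1.0, 1.1));
        vertices.add(new SketchVertex(2.0, 1.8));
        vertices.add(new SketchVertex(4.0, 4.3));
        Result r = run(vertices, 0.01, 100);
        System.out.println("training supersteps: " + r.trainingSupersteps);
        System.out.println("vertices evaluated:  " + r.evaluatedVertices);
        System.out.println("test RMSE:           " + r.testRmse);
    }
}
```

The point of the sketch is that the evaluation superstep reaches every vertex in both of the scenarios described in the thread (all vertices converged, or the superstep limit hit), because inactivity is only a flag and no vertex has actually halted. In a real Giraph job the counters would presumably be aggregators and the phase switch would live in the master compute, as the reply suggests.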