From: Maria Stylianou
Date: Thu, 30 May 2013 12:53:33 +0200
Subject: SimplePageRankComputation - The Reader behaves not as expected
To: user@giraph.apache.org

Hey guys,

I have been playing with PageRank these days and got some weird results.
I wanted to use the Input Format Reader and Output Format Writer provided inside SimplePageRankComputation, so I passed my input file and named those classes on the command line.

Some of the vertices ended up with a value > 1. So I had a look at the logs and noticed that the reader is generating its own vertices and my input file is never used.
The SimplePageRankVertexReader class contains the following lines:

      LongWritable vertexId = new LongWritable(
          (inputSplit.getSplitIndex() * totalRecords) + recordsRead);
      DoubleWritable vertexValue = new DoubleWritable(vertexId.get() * 10d);
      long targetVertexId =
          (vertexId.get() + 1) %
          (inputSplit.getNumSplits() * totalRecords);
      float edgeValue = vertexId.get() * 100f;

And in the Task Logs, it prints:

2013-05-30 11:28:20,808 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=0, vertexValue=0.0, targetVertexId=1, edgeValue=0.0
2013-05-30 11:28:20,809 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=1, vertexValue=10.0, targetVertexId=2, edgeValue=100.0
2013-05-30 11:28:20,809 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=2, vertexValue=20.0, targetVertexId=3, edgeValue=200.0
2013-05-30 11:28:20,809 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=3, vertexValue=30.0, targetVertexId=4, edgeValue=300.0
2013-05-30 11:28:20,809 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=4, vertexValue=40.0, targetVertexId=5, edgeValue=400.0
2013-05-30 11:28:20,809 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=5, vertexValue=50.0, targetVertexId=6, edgeValue=500.0
2013-05-30 11:28:20,809 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=6, vertexValue=60.0, targetVertexId=7, edgeValue=600.0
2013-05-30 11:28:20,810 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=7, vertexValue=70.0, targetVertexId=8, edgeValue=700.0
2013-05-30 11:28:20,810 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=8, vertexValue=80.0, targetVertexId=9, edgeValue=800.0
2013-05-30 11:28:20,810 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=9, vertexValue=90.0, targetVertexId=0, edgeValue=900.0

Could someone explain why this is happening?
Thanks!

-- 
Maria Stylianou
Intern at Telefonica, Barcelona, Spain
Master Student of European Master in Distributed Computing
marsty5.wordpress.com
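To check my reading of the quoted reader lines, here is a minimal standalone sketch in plain Java (no Giraph or Hadoop classes). The class name SyntheticVertexSketch and its methods are hypothetical, and the values splitIndex=0, numSplits=1, totalRecords=10 are assumptions I picked because they are consistent with the log; other combinations could fit too. It is not the actual reader code, only the arithmetic from the four quoted assignments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Giraph: replays the arithmetic of the
// quoted reader lines using plain Java types instead of Writables.
public class SyntheticVertexSketch {

    // Computes one generated record and formats it like the fields in the task log.
    public static String nextVertex(int splitIndex, int numSplits,
                                    long totalRecords, long recordsRead) {
        long vertexId = (splitIndex * totalRecords) + recordsRead;
        double vertexValue = vertexId * 10d;
        // The edge target wraps around once the last generated id is reached.
        long targetVertexId = (vertexId + 1) % (numSplits * totalRecords);
        float edgeValue = vertexId * 100f;
        return "vertexId=" + vertexId
            + ", vertexValue=" + vertexValue
            + ", targetVertexId=" + targetVertexId
            + ", edgeValue=" + edgeValue;
    }

    // Generates every record of one split, in the order recordsRead would advance.
    public static List<String> generate(int splitIndex, int numSplits,
                                        long totalRecords) {
        List<String> lines = new ArrayList<>();
        for (long recordsRead = 0; recordsRead < totalRecords; recordsRead++) {
            lines.add(nextVertex(splitIndex, numSplits, totalRecords, recordsRead));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumed values: a single split holding 10 records.
        for (String line : generate(0, 1, 10)) {
            System.out.println(line);
        }
    }
}
```

With those assumed values the sketch yields the same ten records as the task log (vertexId 0 through 9, with vertex 9 pointing back to 0), which suggests the vertices come from this formula rather than from my input file.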