From: Claudio Martella
Date: Sat, 15 Feb 2014 20:32:12 +0100
Subject: Re: GIRAPH-825 and GIRAPH-840
To: "dev@giraph.apache.org", Sebastian Schelter
Cc: Armando Miraglia

Sebastian,

I had a look at your VertexInputFormat, and I think there might be a bug. Why
are you caching/reusing the ID? This way every vertex parsed by the
VertexReader shares the same ID object, and hence has the same ID. I think
this is broken: you should instantiate a new ID object in preprocessLine.
Can you try it that way?

On Thu, Feb 13, 2014 at 9:50 PM, Sebastian Schelter wrote:

> Hi Armando,
>
> I uploaded my test code to github at:
>
> https://github.com/sscdotopen/giraph/tree/hyperball64-ooc
>
> I'm working on an algorithm to estimate the neighborhood function of the
> graph (similar to [1]). I'm running this on the transposed adjacency matrix
> of a snapshot of the twitter follower graph [2]. For this graph out-of-core
> is not necessary, but I would like to run my algorithm on another larger
> graph that doesn't fit into the aggregated main memory of the cluster
> anymore.
>
> I think for testing purposes, you can run it on any large graph in
> adjacency form.
>
> Our cluster consists of 25 machines with 32GB RAM, 8 cores and 4 disks per
> machine. I use the following options to run the algorithm:
>
> hadoop jar giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar
> org.apache.giraph.GiraphRunner
> org.apache.giraph.examples.hyperball.HyperBall
> --vertexInputFormat org.apache.giraph.examples.hyperball.HyperBallTextInputFormat
> --vertexInputPath hdfs:///ssc/twitter-negative/
> --vertexOutputFormat org.apache.giraph.io.formats.IdWithValueTextOutputFormat
> --outputPath hdfs:///ssc/tmp-123/
> --combiner org.apache.giraph.comm.messages.HyperLogLogCombiner
> --outEdges org.apache.giraph.edge.LongNullArrayEdges
> --workers 24
> --customArguments
> giraph.oneToAllMsgSending=true,
> giraph.isStaticGraph=true,
> giraph.numComputeThreads=15,
> giraph.numInputThreads=15,
> giraph.numOutputThreads=15,
> giraph.maxNumberOfSupersteps=30,
> giraph.useOutOfCoreGraph=true,
> giraph.maxPartitionsInMemory=20
>
> Best,
> Sebastian
>
> [1] http://arxiv.org/abs/1308.2144
> [2] http://konect.uni-koblenz.de/networks/twitter_mpi
>
>
> On 02/12/2014 04:21 PM, Armando Miraglia wrote:
>
>> Hi Sebastian,
>>
>> On Wed, Feb 12, 2014 at 02:59:20PM +0100, Sebastian Schelter wrote:
>>
>>> No. Should I have done that?
>>
>> could you please provide me with the test you have done together with
>> the variables that you have set for the computation? This would
>> help me a lot.
>>
>> Cheers,
>> Armando

--
Claudio Martella
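
The ID-reuse concern raised at the top of this message can be illustrated with a small, self-contained Java sketch. It does not use Giraph or Hadoop classes: MutableId, Vertex, parseWithCachedId and parseWithFreshId are hypothetical names made up for illustration, with MutableId standing in for a mutable Writable-style ID. It also assumes that the vertex keeps a reference to the ID object it is given rather than copying it, which may or may not match what the actual reader does.

```java
import java.util.ArrayList;
import java.util.List;

public class IdReuseDemo {
    /** Hypothetical stand-in for a mutable, Writable-style ID (not a Giraph class). */
    static final class MutableId {
        private long value;
        void set(long value) { this.value = value; }
        long get() { return value; }
    }

    /** Hypothetical vertex that stores a reference to its ID (assumption: no copy is made). */
    static final class Vertex {
        final MutableId id;
        Vertex(MutableId id) { this.id = id; }
    }

    /** Suspected-buggy pattern: one cached ID object is mutated and handed to every vertex. */
    static List<Vertex> parseWithCachedId(long[] rawIds) {
        MutableId cached = new MutableId();
        List<Vertex> vertices = new ArrayList<>();
        for (long raw : rawIds) {
            cached.set(raw);                  // overwrites the ID seen by all earlier vertices
            vertices.add(new Vertex(cached));
        }
        return vertices;
    }

    /** Suggested pattern: a fresh ID object is created for every parsed line. */
    static List<Vertex> parseWithFreshId(long[] rawIds) {
        List<Vertex> vertices = new ArrayList<>();
        for (long raw : rawIds) {
            MutableId id = new MutableId();
            id.set(raw);
            vertices.add(new Vertex(id));
        }
        return vertices;
    }

    public static void main(String[] args) {
        long[] rawIds = {1L, 2L, 3L};
        StringBuilder cached = new StringBuilder();
        for (Vertex v : parseWithCachedId(rawIds)) cached.append(v.id.get()).append(' ');
        StringBuilder fresh = new StringBuilder();
        for (Vertex v : parseWithFreshId(rawIds)) fresh.append(v.id.get()).append(' ');
        System.out.println("cached: " + cached.toString().trim()); // cached: 3 3 3
        System.out.println("fresh:  " + fresh.toString().trim());  // fresh:  1 2 3
    }
}
```

With the cached object, all three vertices end up reporting the last ID that was set, because they alias the same instance. Whether this happens in a real run depends on whether the framework copies or serializes the ID before the next line is parsed, so the sketch only shows why a fresh object per line is the safer choice.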