Date: Fri, 20 Sep 2013 13:19:49 -0700
Subject: Multi-threading in giraph (Exception while using giraph.userPartitionCount)
From: Alok Kumbhare
To: user@giraph.apache.org

Hi,

I am trying to run multi-threaded Giraph workers. This is the command that I use:

    hadoop jar giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar \
        org.apache.giraph.GiraphRunner \
        org.apache.giraph.examples.ConnectedComponentsComputation \
        -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
        -vip in/road-template \
        -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
        -op out/cc_mt_road4 \
        -w 24 \
        -ca giraph.numComputeThreads=4,giraph.userPartitionCount=4

We have a 12-node cluster with 8 cores per node. I am running 24 workers and want each worker to be multi-threaded, so that multiple vertices are processed in parallel on a single node.

I read a suggestion in a different thread to set userPartitionCount=<threadcount> so that each thread works on a different partition.
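To make the numbers above concrete, here is a minimal sketch of the arithmetic behind this setup. It is only an illustration: the function names are hypothetical, it assumes the 24 workers are spread evenly over the 12 nodes, and it treats the meaning of the partition count as an open question by computing both possible readings (a job-wide total versus a per-worker value). Nothing in this message confirms which reading Giraph applies.

```python
from fractions import Fraction


def thread_budget(nodes, cores_per_node, workers, threads_per_worker):
    """Return (workers per node, compute threads per node, spare cores per node).

    Assumes workers are spread evenly across nodes, which the scheduler
    does not guarantee.
    """
    workers_per_node = Fraction(workers, nodes)
    threads_per_node = workers_per_node * threads_per_worker
    spare_cores = cores_per_node - threads_per_node
    return workers_per_node, threads_per_node, spare_cores


def partitions_per_worker(partition_count, workers, job_wide):
    """Partitions available to each worker under one of two readings.

    job_wide=True:  the count is a total shared by all workers (assumption).
    job_wide=False: the count applies to each worker separately (assumption).
    """
    if job_wide:
        return Fraction(partition_count, workers)
    return Fraction(partition_count)


if __name__ == "__main__":
    per_node, threads, spare = thread_budget(
        nodes=12, cores_per_node=8, workers=24, threads_per_worker=4
    )
    print(f"workers per node: {per_node}, threads per node: {threads}, spare cores: {spare}")
    for job_wide in (True, False):
        share = partitions_per_worker(4, 24, job_wide=job_wide)
        print(f"job_wide={job_wide}: partitions per worker = {share}")
```

With the values from the command, the thread budget matches the core count exactly (2 workers per node, 4 threads each, 8 cores). Under the job-wide reading, 4 partitions shared by 24 workers would leave most workers with no partition at all; whether that is related to the exception is not established here.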

However, when I do that, I get the following exception:

    java.lang.IllegalStateException: run: Caught an unrecoverable exception null
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:101)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
    Caused by: java.lang.NullPointerException
        at org.apache.giraph.comm.SendCache.<init>(SendCache.java:100)
        at org.apache.giraph.comm.SendEdgeCache.<init>(SendEdgeCache.java:50)
        at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.<init>(NettyWorkerClientRequestProcessor.java:128)
        at org.apache.giraph.worker.InputSplitsCallable.<init>(InputSplitsCallable.java:104)
        at org.apache.giraph.worker.VertexInputSplitsCallable.<init>(VertexInputSplitsCallable.java:98)
        at org.apache.giraph.worker.VertexInputSplitsCallableFactory.newCallable(VertexInputSplitsCallableFactory.java:80)
        at org.apache.giraph.worker.VertexInputSplitsCallableFactory.newCallable(VertexInputSplitsCallableFactory.java:37)
        at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:213)
        at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:283)
        at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:327)
        at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:508)
        at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:246)
        at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91)
        ... 7 more

When I run the command without giraph.userPartitionCount=4 and specify just -ca giraph.numComputeThreads=4, I don't see any performance improvement.

Please suggest the correct way to use multi-threading, or point me to a document that describes it.

Thanks,
Alok Kumbhare