From: Mohammad Islam
To: user@giraph.apache.org
Subject: Re: yarn container accounting errors.
Date: Fri, 3 Jan 2014 17:19:24 -0800 (PST)
Message-ID: <1388798364.40509.YahooMailNeo@web121702.mail.ne1.yahoo.com>
References: <48203FF7-BF5F-4410-82FD-7C636FF4DB3C@gmail.com>
Mailing-List: contact user-help@giraph.apache.org; run by ezmlm

I agree with Eric and Rafal's assessment. You can send a patch about this.

Regards,
Mohammad


On Friday, January 3, 2014 1:13 PM, Rafal Wojdyla <ravwojdyla@gmail.com> wrote:
I had the same problem, and I have created this ticket: https://issues.apache.org/jira/browse/GIRAPH-819
At first I thought it was just an operator problem, but then I reached the same conclusion as you and removed the if check.

Btw, what was the purpose of that check? In the same function we already check whether there is enough memory available. The memory check still doesn't guarantee that the job will run, though: imagine 10 really small node managers (512 MB each, 5120 MB in total) and a Giraph application that requires 2 containers of 1 GB each.
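
To make the 10 x 512 MB example concrete, here is a minimal sketch that contrasts a cluster-wide memory check with a per-node one. ContainerFitCheck and both method names are hypothetical and are not part of Giraph or YARN. The sketch also assumes that a container must fit entirely on one node and that memory is the only constraint.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical helper (not Giraph code): aggregate versus per-node memory checks. */
final class ContainerFitCheck {

  /** Aggregate check: total free memory in the cluster covers the total request. */
  static boolean fitsInAggregate(List<Integer> freeMbPerNode, int containers, int containerMb) {
    long totalFreeMb = 0;
    for (int freeMb : freeMbPerNode) {
      totalFreeMb += freeMb;
    }
    return totalFreeMb >= (long) containers * containerMb;
  }

  /** Per-node check: count only whole containers that fit on a single node. */
  static boolean fitsPerNode(List<Integer> freeMbPerNode, int containers, int containerMb) {
    if (containerMb <= 0) {
      throw new IllegalArgumentException("containerMb must be positive");
    }
    long placeable = 0;
    for (int freeMb : freeMbPerNode) {
      placeable += freeMb / containerMb; // integer division keeps whole containers only
    }
    return placeable >= containers;
  }

  public static void main(String[] args) {
    // Ten node managers with 512 MB each, two 1 GB containers requested.
    List<Integer> nodes = Collections.nCopies(10, 512);
    System.out.println(fitsInAggregate(nodes, 2, 1024)); // true
    System.out.println(fitsPerNode(nodes, 2, 1024));     // false
  }
}
```

The aggregate check passes because 5120 MB exceeds the 2048 MB requested, while the per-node check fails because no single node can host a 1 GB container.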


On Fri, Jan 3, 2014 at 9:47 PM, Eric Kimbrel <lekimbrel@gmail.com> wrote:

I am running on Hadoop 2.2.0-cdh5.0.0-beta-1 in pure YARN mode. I have noticed two issues in org.apache.giraph.yarn.GiraphYarnClient.

The Giraph code base is 1.1.0-SNAPSHOT, downloaded on Jan 2, 2014.

Both issues are in the checkPerNodeResourcesAvailable method.

The first issue is that the number of containers available is calculated using node.getNumContainers(). According to the YARN documentation, this is the number of containers currently running on the node, so on a YARN cluster with no jobs running, all nodes report 0 containers.
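
The difference between running and available containers can be sketched as follows. This is only an illustration under stated assumptions: NodeSnapshot is a hypothetical stand-in for the per-node figures in a YARN node report (NodeReport also exposes getCapability() and getUsed()), and estimating free slots from memory alone is an assumption, not something the scheduler is known to guarantee.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch (not Giraph code): running containers versus estimated free slots. */
final class FreeContainerEstimate {

  /** Hypothetical stand-in for the per-node figures a YARN node report provides. */
  static final class NodeSnapshot {
    final int capacityMb;
    final int usedMb;
    final int runningContainers;

    NodeSnapshot(int capacityMb, int usedMb, int runningContainers) {
      this.capacityMb = capacityMb;
      this.usedMb = usedMb;
      this.runningContainers = runningContainers;
    }
  }

  /** Sum of containers currently running, which is 0 on an idle cluster. */
  static int runningContainers(List<NodeSnapshot> nodes) {
    int total = 0;
    for (NodeSnapshot node : nodes) {
      total += node.runningContainers;
    }
    return total;
  }

  /** Estimate of how many containers of the given size could still be started. */
  static int estimateFreeContainers(List<NodeSnapshot> nodes, int containerMb) {
    if (containerMb <= 0) {
      throw new IllegalArgumentException("containerMb must be positive");
    }
    int free = 0;
    for (NodeSnapshot node : nodes) {
      int freeMb = Math.max(0, node.capacityMb - node.usedMb);
      free += freeMb / containerMb; // whole containers per node
    }
    return free;
  }

  public static void main(String[] args) {
    // Three idle nodes with 4096 MB each.
    List<NodeSnapshot> idle = Arrays.asList(
        new NodeSnapshot(4096, 0, 0),
        new NodeSnapshot(4096, 0, 0),
        new NodeSnapshot(4096, 0, 0));
    System.out.println(runningContainers(idle));            // 0
    System.out.println(estimateFreeContainers(idle, 1024)); // 12
  }
}
```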

The second issue (in the same method) is this if block:

if (workers < numContainers) {
  throw new RuntimeException("Giraph job requires " + workers +
    " containers to run; cluster only hosts " + numContainers);
}

So with the current setup, if a cluster has 4 containers currently running and a Giraph job is submitted that requires 2 containers, the job will fail saying "Giraph job requires 2 containers to run; cluster only hosts 4".

The if statement should be "workers > numContainers", and numContainers should reflect the total number of containers available, not the number of containers currently running. I don't know YARN well, so I don't know if such a number is available at all.
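
A minimal sketch of the check with the comparison reversed, as suggested above. WorkerCountCheck and checkEnoughContainers are hypothetical names, and availableContainers is assumed to be a real count of containers the cluster can still start, however that number is obtained. This is not the actual Giraph patch.

```java
/** Hypothetical sketch (not the Giraph patch): fail only when too few containers are available. */
final class WorkerCountCheck {

  /**
   * Throws if the job needs more containers than the cluster can provide.
   * availableContainers is assumed to be free capacity, not the running count.
   */
  static void checkEnoughContainers(int workers, int availableContainers) {
    if (workers > availableContainers) {
      throw new RuntimeException("Giraph job requires " + workers
          + " containers to run; cluster only hosts " + availableContainers);
    }
  }

  public static void main(String[] args) {
    checkEnoughContainers(2, 4); // passes: 2 workers fit into 4 available containers
    try {
      checkEnoughContainers(4, 2); // fails: 4 workers do not fit into 2
    } catch (RuntimeException e) {
      System.out.println(e.getMessage());
    }
  }
}
```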

For the time being I plan on getting this working by bypassing the check altogether.