From: "Tu, Min"
To: "user@giraph.apache.org"
Subject: Re: General Scalability Questions for Giraph
Date: Thu, 14 Feb 2013 23:17:38 +0000

Hi Claudio,

Thank you very much for your valuable input. I will follow your suggestions to try Giraph 0.2 (from trunk) and the workers setting.

Min

From: Claudio Martella <claudio.martella@gmail.com>
Reply-To: "user@giraph.apache.org" <user@giraph.apache.org>
Date: Thursday, February 14, 2013 3:06 PM
To: "user@giraph.apache.org" <user@giraph.apache.org>
Subject: Re: General Scalability Questions for Giraph

Hi Tu,

first of all, I really suggest you run trunk, especially if you have a large graph. That being said:

1) Yes and no; the jargon is misleading. You should have n - 1 workers (what you call mappers for the Giraph job), with n as the max number of mappers you can have in your cluster as an upper limit (the additional 1 goes to the master). In general, I'd strongly suggest you have 1 mapper/worker per node/MACHINE, and k compute threads per worker, with k as the number of cores on that machine. You'll save Netty sending messages over the loopback and additional JVM overhead.

2) Yes, but I challenge you to compute those sizes beforehand :) Also consider the size of the messages being produced by your algorithm. E.g. roughly, PageRank produces a double for each edge in the graph, during each superstep.

3) AFAIK there's no way, but I might be wrong here.

4) I'd suggest you also talk in terms of nodes. Having multiple workers per machine gives a misleading picture of scalability in certain respects (such as network I/O). I have been running Giraph jobs on hundreds of mappers and around 65 machines. I know others here have done bigger numbers (~300 workers). I'd say the upper limit to scalability is your main memory ATM, so you might want to have a look at out-of-core graph and messages.

Hope it helps,
Claudio

On Thu, Feb 14, 2013 at 11:50 PM, Tu, Min wrote:

Hi,

I have some general scalability questions for Giraph. Based on the Giraph design, I am assuming all the mappers in a Giraph job should be running at the same time. If so, then

1. The max mappers for a Giraph job <= total mapper slots in the whole cluster
2. The max data input size to Giraph should be <= total mapper slots * mapper memory limit
3. If the total mapper slots in the cluster is 200 and only 100 mappers are currently available, and the Giraph job requires 150 mappers
   * Without any configuration change, the 100 mappers of the Giraph job will be started but the Giraph job will NOT run successfully
   * Is there any configuration in Giraph to start the job ONLY at the time when all the mapper slots are available?
4. How is the scalability in Giraph? I can ONLY run up to 150 mappers for my Giraph job. Does anyone run a large Giraph job in a large cluster successfully?
   * I am using Giraph 0.1 in my cluster

Thanks a lot for your time and inputs.

Min

--
Claudio Martella
claudio.martella@gmail.com
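
The layout advice in Claudio's point 1 (at most n - 1 workers, one worker per machine, one compute thread per core) can be sketched as a small calculation. This is only an illustration under stated assumptions: `plan_workers` and its parameters are hypothetical names, not part of Giraph, and the 8 cores per machine in the usage note is an invented figure.

```python
def plan_workers(max_mappers: int, machines: int, cores_per_machine: int) -> dict:
    """Suggest a worker layout following the advice in point 1.

    Hypothetical helper for illustration; it does not call or configure Giraph.
    """
    if max_mappers < 2:
        raise ValueError("need at least 2 mapper slots: 1 master + 1 worker")
    if machines < 1 or cores_per_machine < 1:
        raise ValueError("machines and cores_per_machine must be positive")

    # One mapper slot goes to the master, so n - 1 is the upper limit for workers.
    upper_bound = max_mappers - 1

    # One worker per machine avoids loopback traffic and extra JVM overhead.
    workers = min(machines, upper_bound)

    return {
        "workers": workers,
        # k compute threads per worker, with k the number of cores on the machine.
        "compute_threads_per_worker": cores_per_machine,
        # Slots that must be free at the same time: all workers plus the master.
        "mapper_slots_needed": workers + 1,
    }
```

For example, `plan_workers(200, 65, 8)` suggests 65 workers with 8 compute threads each, which needs 66 mapper slots to be available at once.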
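
The sizing argument in point 2 can be made concrete with a rough estimate. This sketch assumes, as Claudio says roughly, one 8-byte double per edge per superstep for PageRank. It counts raw message payload only and ignores vertex and edge storage, object overhead, and serialization, so it is a lower bound at best. The function names and the example figures are invented for illustration.

```python
DOUBLE_BYTES = 8  # size of one double-precision value


def pagerank_message_bytes(num_edges: int) -> int:
    """Raw message payload per superstep: one double per edge (rough estimate)."""
    return num_edges * DOUBLE_BYTES


def messages_fit_in_memory(num_edges: int, workers: int, bytes_per_worker: int) -> bool:
    """Check the payload against the combined memory of all workers.

    Assumes messages are spread evenly across workers, which real graphs
    with skewed degree distributions may not satisfy.
    """
    return pagerank_message_bytes(num_edges) <= workers * bytes_per_worker
```

With a hypothetical graph of 1 billion edges, `pagerank_message_bytes(1_000_000_000)` is 8,000,000,000 bytes per superstep, before counting the graph itself. When an estimate like this approaches the cluster's main memory, the out-of-core graph and messages options mentioned in point 4 become relevant.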