From: Avery Ching
Date: Mon, 02 Jul 2012 01:00:52 -0700
To: dev@giraph.apache.org
Subject: Re: How does scaling work in Giraph?
Message-ID: <4FF15534.5060906@apache.org>

Praveen, response inline. Hope it's helpful.

On 6/30/12 10:47 AM, Praveen Sripati wrote:
> Could someone respond to the below mail please?
>
> Thanks,
> Praveen
>
> On Thu, Jun 28, 2012 at 7:04 PM, Praveen Sripati wrote:
>
>> During the 24th minute of the recent Hadoop Summit Video [1] Avery Ching
>> talks about how Giraph is made scalable. I am interested in Hama which is
>> also based on the BSP model and would like to know more details on how
>> Giraph is made scalable.
>>
>> Basically, at the end of each super step, the BSP tasks send some metrics
>> to the master and the master partitions the data in the most loaded BSP
>> tasks and uses the available free map slots to process them.
>>
>> 1) Where is the code for the above logic? I am new to Giraph.

See BspWorker#finishSuperstep()

>> 2) What is the logic behind the partitioning of the data in the master
>> after the super step? Let's say that the data has been partitioned using
>> Hash partitioning.

See GraphPartitionerFactory

>> 3) Similarly will Giraph also scale down? Will the partitions be merged?

This is totally up to the implementation of GraphPartitionerFactory.

>> Thanks,
>> Praveen
>>
>> [1] - http://www.youtube.com/watch?v=b5Qmz4zPj-M
>>
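
To make questions 2 and 3 a bit more concrete, here is a minimal sketch in plain Java. It is not Giraph code and does not use GraphPartitionerFactory; the class PartitionBalanceSketch and its methods partitionFor and rebalanceOnce are hypothetical names made up for illustration. It assumes vertices are hash partitioned by id, that each worker reports a vertex count per partition at the end of a superstep, and that the master then moves one whole partition from the most loaded worker to the least loaded one (for example, a worker sitting in a free slot).

```java
// Hypothetical illustration only; none of these names come from Giraph.
final class PartitionBalanceSketch {

    // Hash partitioning: map a vertex id to a partition in [0, numPartitions).
    static int partitionFor(long vertexId, int numPartitions) {
        return Math.floorMod(Long.hashCode(vertexId), numPartitions);
    }

    // One rebalancing step between supersteps.
    // owner[p] is the worker that currently holds partition p.
    // partitionSize[p] is the vertex count reported for partition p.
    // Moves the largest partition that still narrows the gap between the
    // most loaded and least loaded worker. Returns false if no move helps.
    static boolean rebalanceOnce(int[] owner, long[] partitionSize, int numWorkers) {
        // Aggregate the reported sizes into a load per worker.
        long[] load = new long[numWorkers];
        for (int p = 0; p < owner.length; p++) {
            load[owner[p]] += partitionSize[p];
        }

        // Find the most loaded and least loaded workers.
        int max = 0;
        int min = 0;
        for (int w = 1; w < numWorkers; w++) {
            if (load[w] > load[max]) {
                max = w;
            }
            if (load[w] < load[min]) {
                min = w;
            }
        }

        // A move of size s helps only if 0 < s < gap, because then both
        // workers end up strictly below the previous maximum load.
        long gap = load[max] - load[min];
        int best = -1;
        for (int p = 0; p < owner.length; p++) {
            boolean helps = owner[p] == max
                    && partitionSize[p] > 0
                    && partitionSize[p] < gap;
            if (helps && (best < 0 || partitionSize[p] > partitionSize[best])) {
                best = p;
            }
        }
        if (best < 0) {
            return false;
        }
        owner[best] = min;
        return true;
    }
}
```

With owner = {0, 0, 1, 1}, partitionSize = {40, 40, 10, 10} and three workers, worker 2 starts empty, so one call to rebalanceOnce moves partition 0 from worker 0 to worker 2. Merging partitions to scale down would be a separate policy decision, which is why the answer to question 3 depends on the partitioner implementation.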