Message-ID: <4F358F0D.7080306@apache.org>
Date: Fri, 10 Feb 2012 13:41:33 -0800
From: Avery Ching
To: giraph-user@incubator.apache.org
Subject: Re: question about vertex instantiation location. . .

By default, you are using the HashPartitionerFactory. This will create the partitions ahead of time and balance them equally by count across the workers. Therefore, assuming your vertex IDs are uniformly distributed across the VertexId space, the graph should be balanced evenly across the workers according to the number of vertices.

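
As a rough illustration of the hash-partitioning idea described above, here is a minimal sketch. It is not Giraph's actual implementation: the class `HashPartitionSketch`, its method names, the use of `long` vertex IDs, and the round-robin assignment of partitions to workers are all assumptions made for the example. It shows why, under such a scheme, the worker that receives a vertex would depend only on the hash of its ID and not on where existing vertices live.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names come from Giraph.
public class HashPartitionSketch {

    // Map a vertex ID to a partition index.
    // Math.floorMod keeps the result non-negative even for negative hash codes.
    public static int partitionFor(long vertexId, int partitionCount) {
        return Math.floorMod(Long.hashCode(vertexId), partitionCount);
    }

    // Assumed policy: partitions are handed to workers round-robin,
    // so every worker owns roughly the same number of partitions.
    public static int workerFor(int partitionId, int workerCount) {
        return partitionId % workerCount;
    }

    // Count how many of the vertex IDs 0..vertexCount-1 land on each worker.
    public static int[] verticesPerWorker(int vertexCount, int partitionCount, int workerCount) {
        int[] counts = new int[workerCount];
        for (long id = 0; id < vertexCount; id++) {
            int partition = partitionFor(id, partitionCount);
            counts[workerFor(partition, workerCount)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 200 uniformly spread IDs, 40 partitions, 20 workers.
        System.out.println(Arrays.toString(verticesPerWorker(200, 40, 20)));

        // A vertex ID that has never been seen before still maps to a fixed worker.
        long newVertexId = 1234567L;
        int worker = workerFor(partitionFor(newVertexId, 40), 20);
        System.out.println("vertex " + newVertexId + " -> worker " + worker);
    }
}
```

With uniformly distributed IDs, the per-worker counts in this sketch come out even, which matches the balance argument above. A skewed ID distribution would produce uneven counts.
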
If you look at PartitionBalancer, you can try to rebalance the graph while it is running, if you like. This is a bit experimental, but should work. The choices for balancing are: no balancing, balance by edges, or balance by vertices.

Hope that helps,

Avery

On 2/10/12 1:25 PM, David Garcia wrote:
> Hey guys. . . I have a question about "dynamic" vertex instantiation via
> the sendMsg(. . .) method. I have a job that starts processing on a
> sequenceFile with only two vertices in it. Each vertex has information in
> its value that tells it what vertices are adjacent to it. The primary
> reason I'm doing this is to avoid loading the entire graph into the job.
> There are many vertices that won't do any processing (no need to load
> them). I would like to take my two vertices and "dynamically" build the
> graph by sending messages. So far, my experimentation shows that this is
> promising. . . but I have a question WRT load balancing for new vertex
> instantiation. When I call sendMsg(newVertexID), where will the vertex be
> instantiated? If I specify 20 mappers (but with only two vertices in my
> sequence file), obviously there is going to be at least one mapper without
> a vertex. Is it possible that the vertex created by sendMsg(newVertexID)
> will be instantiated on an empty mapper? I would like this. . . for load
> balancing purposes.
>
> -david