Date: Sat, 18 Aug 2012 16:09:37 +1100 (NCT)
From: "Eli Reisman (JIRA)"
To: giraph-dev@incubator.apache.org
Reply-To: dev@giraph.apache.org
Message-ID: <1186572546.26226.1345266577994.JavaMail.jiratomcat@arcas>
In-Reply-To: <601719947.8629.1344968258053.JavaMail.jiratomcat@arcas>
Subject: [jira] [Updated] (GIRAPH-301) InputSplit Reservations are clumping, leaving many workers asleep while other process too many splits and get overloaded.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

     [ https://issues.apache.org/jira/browse/GIRAPH-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Reisman updated GIRAPH-301:
-------------------------------

    Attachment: GIRAPH-301-3.patch

I think this approach might bear more fruit than the others, as it worked before in 250 and I probably should have incorporated it into the locality patch already. If this reduces the number of reads a worker needs to find an unclaimed split on the first round of iterations, the clumping problem should be solved, and the ZK writes that begin to pile up at scale will no longer slow the reads down so much that many workers never make their way through the whole list. I'll report back on the success or failure of the tests by Monday.

> InputSplit Reservations are clumping, leaving many workers asleep while other process too many splits and get overloaded.
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GIRAPH-301
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-301
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp, graph, zookeeper
>    Affects Versions: 0.2.0
>            Reporter: Eli Reisman
>            Assignee: Eli Reisman
>              Labels: patch
>             Fix For: 0.2.0
>
>         Attachments: GIRAPH-301-1.patch, GIRAPH-301-2.patch, GIRAPH-301-3.patch
>
>
> With recent additions to the codebase, users here have noticed that many workers are able to load input splits extremely quickly, and this has altered the behavior of Giraph during INPUT_SUPERSTEP when using the current algorithm for split reservations. A few workers process multiple splits (often overwhelming Netty and getting GC errors as they attempt to offload too much data too quickly) while many (often most) of the others just sleep through the superstep, never successfully participating at all.
> Essentially, the current algo is:
> 1. scan the input split list, skipping nodes that are marked "Finished"
> 2. grab the first unfinished node in the list (reserved or not) and check its reserved status.
> 3. if not reserved, attempt to reserve it and return it if successful.
> 4. if the first one you check is already taken, sleep for way too long and only wake up if another worker finishes a split, then contend with that worker for another split, while the majority of the split list might sit idle, not actually checked or claimed by anyone yet.
> This does not work. By making a few simple changes (and acknowledging that ZK reads are cheap, only writes are not) this patch is able to get every worker involved and keep them in the game, ensuring that the INPUT_SUPERSTEP passes quickly and painlessly, and without overwhelming Netty, by spreading the memory load the split readers bear more evenly. If the giraph.splitmb and -w options are set correctly, behavior is now exactly as one would expect it to be.
> This also results in INPUT_SUPERSTEP passing more quickly, and lets a job survive the INPUT_SUPERSTEP for a given data load on fewer Hadoop memory slots.
>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
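
To make the clumping in steps 1-4 of the quoted description concrete, here is a minimal in-memory sketch. It is not Giraph code and not the contents of any GIRAPH-301 patch: the class, the State enum, and both method names are hypothetical, a plain list stands in for the ZooKeeper split nodes, and the second strategy (keep reading past reserved splits, starting from a per-worker offset) is only an assumed illustration of "reads are cheap, only writes are not".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical in-memory model; a real worker would read and write ZooKeeper nodes.
public class SplitReservationSketch {
    enum State { UNRESERVED, RESERVED, FINISHED }

    // Steps 1-4 as quoted: only the first unfinished split is ever examined.
    // Returns the reserved split index, or -1 where the worker would go to sleep.
    static int reserveFirstUnfinished(List<State> splits) {
        for (int i = 0; i < splits.size(); i++) {
            if (splits.get(i) == State.FINISHED) {
                continue; // step 1: skip finished nodes
            }
            if (splits.get(i) == State.UNRESERVED) {
                splits.set(i, State.RESERVED); // step 3: the one "write"
                return i;
            }
            return -1; // step 4: first unfinished split is taken, give up
        }
        return -1; // everything is finished
    }

    // Assumed alternative: keep reading past reserved or finished splits,
    // starting at a per-worker offset, and write only when one is free.
    static int reserveAnyUnreserved(List<State> splits, int startOffset) {
        int n = splits.size();
        for (int k = 0; k < n; k++) {
            int i = Math.floorMod(startOffset + k, n);
            if (splits.get(i) == State.UNRESERVED) {
                splits.set(i, State.RESERVED);
                return i;
            }
        }
        return -1; // nothing left to claim
    }

    // Each worker makes one reservation attempt; counts how many got a split.
    static int busyWorkers(int workers, int splitCount, boolean scanAll) {
        List<State> splits =
            new ArrayList<>(Collections.nCopies(splitCount, State.UNRESERVED));
        int busy = 0;
        for (int w = 0; w < workers; w++) {
            int got = scanAll
                ? reserveAnyUnreserved(splits, w)
                : reserveFirstUnfinished(splits);
            if (got >= 0) {
                busy++;
            }
        }
        return busy;
    }

    public static void main(String[] args) {
        System.out.println(busyWorkers(4, 8, false)); // prints 1
        System.out.println(busyWorkers(4, 8, true));  // prints 4
    }
}
```

With 4 workers and 8 splits, the first strategy leaves three workers idle after a single attempt even though seven splits are unclaimed, which is the behavior the issue describes. The sketch ignores concurrency, retries, and wake-ups entirely.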