From: Doug Balog <doug@conviva.com>
To: core-user@hadoop.apache.org
Subject: Re: Why separate Map/Reduce task limits per node ?
Date: Tue, 28 Oct 2008 18:50:43 -0400
Message-Id: <74FC3A43-A7A9-42A4-A988-30F116880368@conviva.com>
In-Reply-To: <623d9cf40810281449h58f29fa7n7a883ddd9982a0fa@mail.gmail.com>

Thanks Alex. I found a JIRA that relates to my question:
https://issues.apache.org/jira/browse/HADOOP-3420

If I decide to do something about this, I'll follow up with HADOOP-3420.

Thanks,
DougB

On Oct 28, 2008, at 5:49 PM, Alex Loddengaard wrote:

> I understand your question now, Doug; thanks for clarifying. However,
> I don't think I can give you a great answer. I'll give it a shot,
> though:
>
> It does seem like having a single task configuration in theory would
> improve utilization, but it might also make things worse. For example,
> generally speaking, reducers take longer to execute.
> This means that it would be possible for some nodes to only perform
> reduce tasks for a given time period in a setup where each node had a
> dynamic amount of mappers and reducers. If a node was running all
> reducers, then that node would have lots of output data being written
> to it, hence not distributing data evenly. Perhaps one could argue
> that over time data would still be distributed evenly, though.
>
> That's the best I can do, I think. Can others chime in?
>
> Alex
>
> On Tue, Oct 28, 2008 at 1:41 PM, Doug Balog wrote:
>
>> Hi Alex, I'm sorry, I think you misunderstood my question. Let me
>> explain some more.
>>
>> I have a hadoop cluster of dual quad core machines.
>> I'm using hadoop-0.18.1 with Matei's fairscheduler patch
>> https://issues.apache.org/jira/browse/HADOOP-3746 running in FIFO
>> mode. I have about 5 different jobs running in a pipeline. The number
>> of map/reduce tasks per job varies based on the input data.
>> I assign the various jobs different priorities, and Matei's FIFO
>> scheduler does almost exactly what I want. (The default scheduler did
>> a horrible job with our workload, because it prefers map tasks.)
>>
>> I'm trying to tune the tasks per node to fully utilize my cluster;
>> my goal is < 10% idle. I'm pretty sure my jobs are cpu bound. I can
>> control the number of tasks per node by setting
>> mapred.tasktracker.map.tasks.maximum and
>> mapred.tasktracker.reduce.tasks.maximum in hadoop-site.xml.
>>
>> But I don't have a fixed number of maps and reduces that I run, so
>> saying 5+3 tends to leave my nodes more idle than I want. I just want
>> to say run 8 tasks per node; I don't care what the mix between map
>> and reduce tasks per node is.
>>
>> I've been wondering why there are separate task limits for map and
>> reduce. Why not a single generic task limit per node ?
>>
>> The only reason I can think of for having separate map and reduce
>> task limits is the default scheduler. It wants to schedule all map
>> tasks first, so you really need to limit the number of them so that
>> reduces have a chance to run.
>>
>> Thanks for any insight,
>> Doug
>>
>>
>> On Oct 27, 2008, at 6:26 PM, Alex Loddengaard wrote:
>>
>>> In most jobs, map and reduce tasks are significantly different, and
>>> their runtimes vary as well. The number of reducers also determines
>>> how many output files you have. So in the case when you would want
>>> one output file, having a single generic task limit would mean that
>>> you'd also have one mapper. This would be quite a limiting setup.
>>>
>>> Hope this helps.
>>>
>>> Alex
>>>
>>> On Mon, Oct 27, 2008 at 1:31 PM, Doug Balog wrote:
>>>
>>>> Hi,
>>>> I've been wondering why there are separate task limits for map and
>>>> reduce. Why not a single generic task limit per node ?
>>>>
>>>> Thanks for any insight,
>>>>
>>>> Doug
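
[For reference, the two per-node slot limits Doug mentions are set in
hadoop-site.xml. A minimal sketch, using the 5+3 split from the thread
as illustrative values (property names are the Hadoop 0.18-era ones
cited above; actual values should be tuned to the workload):]

```xml
<!-- hadoop-site.xml fragment: per-TaskTracker concurrent task limits.
     The two limits are independent of each other, which is exactly
     what this thread is questioning: there is no single combined
     "8 tasks of any kind" limit. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>5</value>
  <!-- at most 5 map tasks run concurrently on this node -->
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>3</value>
  <!-- at most 3 reduce tasks run concurrently on this node -->
</property>
```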