From: WangRamon <ramon_wang@hotmail.com>
To: mapreduce-user@hadoop.apache.org
Subject: Why most of the free reduce slots are NOT used for my Hadoop Jobs? Thanks.
Date: Sat, 10 Mar 2012 18:39:28 +0800
Hi All
 
I'm using Hadoop-0.20-append. The cluster contains 3 nodes, and each node has 14 map slots and 14 reduce slots. Here is the configuration (in mapred-site.xml):
 
 
<configuration>
    <property>
        <name>mapred.tasktracker.map.tasks.maximum</name>
        <value>14</value>
    </property>
    <property>
        <name>mapred.tasktracker.reduce.tasks.maximum</name>
        <value>14</value>
    </property>
    <property>
        <name>mapred.reduce.tasks</name>
        <value>73</value>
    </property>
</configuration>

 
When I submit 5 jobs simultaneously (the input data for each job is small for this test, about 2~5 MB), I assume the jobs will use as many of the slots as possible. Each job did create 73 reduce tasks as configured above, so there are 5 * 73 reduce tasks in total. However, most of them stay in the pending state and only about 12 of them are running, which is very small compared to the total number of reduce slots: 42 for the 3-node cluster.
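To double-check the slot numbers, this is a small diagnostic I can run against the JobTracker (a minimal sketch using the old mapred API; the class name is made up, and it assumes the cluster's mapred-site.xml is on the classpath):

    import org.apache.hadoop.mapred.ClusterStatus;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class SlotCheck {
        public static void main(String[] args) throws Exception {
            // Picks up the JobTracker address from the local Hadoop config
            JobClient client = new JobClient(new JobConf());
            ClusterStatus status = client.getClusterStatus();
            // With 3 TaskTrackers x 14 reduce slots each, capacity should be 42
            System.out.println("Reduce slot capacity: " + status.getMaxReduceTasks());
            System.out.println("Reduces running:      " + status.getReduceTasks());
        }
    }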
 
What is interesting is that it is always about 12 of them running; I tried a few times.
 
So, I thought it might be caused by the scheduler. I changed it to the Fair Scheduler and created 3 pools; the allocations file is as below:
 
<?xml version="1.0"?>
<allocations>
 <pool name="pool-a">
  <minMaps>14</minMaps>
  <minReduces>14</minReduces>
  <weight>1.0</weight>
 </pool>
 <pool name="pool -b">
  <minMaps>14</minMaps>
  <minReduces>14</minReduces>
  <weight>1.0</weight>
 </pool>
 <pool name="pool-c">
  <minMaps>14</minMaps>
  <minReduces>14</minReduces>
  <weight>1.0</weight>
 </pool>
 
</allocations>
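For completeness, this allocations file is hooked up through mapred-site.xml; a sketch of the entries I would expect (the allocation file path here is just an assumption, and poolnameproperty is set so jobs can pick a pool via pool.name):

    <property>
        <name>mapred.jobtracker.taskScheduler</name>
        <value>org.apache.hadoop.mapred.FairScheduler</value>
    </property>
    <property>
        <name>mapred.fairscheduler.allocation.file</name>
        <!-- assumed path -->
        <value>/path/to/fair-scheduler.xml</value>
    </property>
    <property>
        <name>mapred.fairscheduler.poolnameproperty</name>
        <value>pool.name</value>
    </property>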
 
Then I submitted the 5 jobs simultaneously again, assigning them to these pools randomly. I can see the jobs were assigned to different pools, but it's still the same problem: only about 12 of the reduce tasks across the different pools are running. Here is the output I copied from the Fair Scheduler monitor GUI:
 
Pool     Running Jobs  Min Maps  Min Reduces  Running Maps  Running Reduces
pool-a   2             14        14           0             9
pool-b   0             14        14           0             0
pool-c   2             14        14           0             3
 
pool-a and pool-c have a total of 12 reduce tasks running (9 + 3), but the cluster clearly has free reduce slots: pool-c alone is guaranteed a minimum of 14 reduces yet is only running 3, so I do have at least about 11 reduce slots available in my cluster.
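In case it matters, this is roughly how each of the 5 jobs is submitted to a pool (a hypothetical driver using the old mapred API; MyJob, the job name, and the paths are made-up names for illustration, and it assumes mapred.fairscheduler.poolnameproperty is set to pool.name as sketched above):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class MyJob {
        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(MyJob.class);
            conf.setJobName("reduce-slot-test");
            // Route this job to one of the pools from the allocations file
            conf.set("pool.name", "pool-a");
            // Same value as mapred.reduce.tasks in mapred-site.xml
            conf.setNumReduceTasks(73);
            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));
            // Relies on the default identity mapper/reducer just for the sketch
            JobClient.runJob(conf);
        }
    }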
 
So can anyone please give me some suggestions on why NOT all of my reduce slots are being used? Thanks in advance.
 
Cheers
Ramon