hadoop-general mailing list archives

From Robert Evans <ev...@yahoo-inc.com>
Subject Re: how to restrict the concurrent running map tasks?
Date Fri, 18 Jan 2013 15:49:25 GMT
The general@ list is for product announcements and the like.  You really
should direct your question to mapreduce-user@.  I have BCC'd general@.

I am not an expert on this, but I looked and it appears that you have to
configure a special task scheduler in the JobTracker to make this happen:

org.apache.hadoop.mapred.LimitTasksPerJobTaskScheduler


It looks a lot like the FIFO scheduler, but with a limit on the number of
tasks.  I am not sure if this is something that will work for you or not.
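
For illustration, here is an untested sketch of what the JobTracker-side
configuration might look like.  It assumes the scheduler is selected through
the mapred.jobtracker.taskScheduler property in mapred-site.xml on the
JobTracker node, and that the limit is read from the JobTracker's own
configuration rather than from the job's.  If that is right, it would explain
why setting the property in the client-side Configuration had no effect.

```xml
<!-- mapred-site.xml on the JobTracker node (assumed location).
     The JobTracker would need a restart to pick up a scheduler change. -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.LimitTasksPerJobTaskScheduler</value>
</property>
<property>
  <!-- Assumed to cap the running tasks of each job, cluster-wide -->
  <name>mapred.jobtracker.taskScheduler.maxRunningTasksPerJob</name>
  <value>10</value>
</property>
```

Note that this would apply to every job on the cluster, and the property name
suggests it counts a job's running tasks in general, not only its maps.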

--Bobby

On 1/18/13 4:22 AM, "hwang" <joe.haiwang@gmail.com> wrote:

>Hi all:
>
>My Hadoop version is 1.0.2. Now I want at most 10 map tasks running at the
>same time. I have found two parameters related to this question.
>
>a) mapred.job.map.capacity
>
>but in my Hadoop version, this parameter seems to have been abandoned.
>
>b) mapred.jobtracker.taskScheduler.maxRunningTasksPerJob
>(http://grepcode.com/file/repo1.maven.org/maven2/com.ning/metrics.collector/1.0.2/mapred-default.xml)
>
>I set this variable like below:
>
>Configuration conf = new Configuration();
>conf.set("date", date);
>conf.set("mapred.job.queue.name", "hadoop");
>conf.set("mapred.jobtracker.taskScheduler.maxRunningTasksPerJob", "10");
>
>DistributedCache.createSymlink(conf);
>Job job = new Job(conf, "ConstructApkDownload_" + date);
>...
>
>The problem is that it doesn't work. There are still more than 50 maps
>running when the job starts.
>
>I'm not sure whether I set this parameter the wrong way or misunderstand
>it.
>
>After looking through the Hadoop documentation, I can't find another
>parameter to limit the number of concurrently running map tasks.
>
>I hope someone can help me. Thanks.

