hadoop-common-user mailing list archives

From Jason Venner <ja...@attributor.com>
Subject Re: How can I control Number of Mappers of a job?
Date Sat, 02 Aug 2008 00:11:23 GMT
We control the number of map tasks by carefully managing the input split 
size when we need to.
This may require using the MultipleFileInput classes or aggregating your 
input files beforehand.
If you have more input files than you want map tasks, you need some 
aggregation, either by concatenation or via MultipleFileInput.
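
To make the aggregation idea concrete, here is a minimal sketch in plain Java, with no Hadoop dependency. It packs a list of file sizes into at most a chosen number of groups, each of which would then be handled by one map task. The class name FileGroupingSketch, the method group, and the greedy packing rule are hypothetical and only illustrate the idea; they are not the actual logic of the Hadoop multi-file input classes.

```java
import java.util.ArrayList;
import java.util.List;

public class FileGroupingSketch {

    // Pack files (represented only by their lengths) into at most maxGroups
    // groups, in input order. A group is closed once it reaches the average
    // group size, except for the last group, which takes whatever remains.
    // Hypothetical illustration, not Hadoop's implementation.
    static List<List<Long>> group(long[] fileLengths, int maxGroups) {
        long total = 0;
        for (long len : fileLengths) {
            total += len;
        }
        // Target size per group: ceil(total / maxGroups), at least 1.
        long target = Math.max(1, (total + maxGroups - 1) / maxGroups);

        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentSize = 0;
        for (long len : fileLengths) {
            current.add(len);
            currentSize += len;
            // Keep one slot free for the final group.
            if (currentSize >= target && groups.size() < maxGroups - 1) {
                groups.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
        }
        if (!current.isEmpty()) {
            groups.add(current);
        }
        return groups;
    }

    public static void main(String[] args) {
        long[] files = {10, 10, 10, 10, 10, 10};
        // Six equally sized files, at most two map tasks wanted.
        System.out.println(group(files, 2).size() + " groups");
    }
}
```

With six equal files and a limit of two, each group ends up holding three files, so the job would run two map tasks instead of six.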

The case of 1 mapper per input file requires setting the input split size 
to Long.MAX_VALUE (see the datajoin classes for examples).
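
The effect of a very large split size can be sketched in plain Java without any Hadoop dependency. The formula below, max(minSize, min(goalSize, blockSize)), is an assumption modeled on how the classic file input format is commonly described; the slop factor real Hadoop applies to the last split is left out. The class and method names are hypothetical. In releases of this era the minimum split size is, as far as I know, set through the mapred.min.split.size property, but check that against your version.

```java
public class SplitSizeSketch {

    // Assumed split-size rule: the minimum split size wins over both the
    // goal size (total input / requested map tasks) and the block size.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Number of splits for one file; assumes fileLength > 0.
    // Written as a ceiling division that cannot overflow.
    static long countSplits(long fileLength, long splitSize) {
        return (fileLength - 1) / splitSize + 1;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024;     // 64 MB block
        long fileLength = 1024L * 1024 * 1024;  // one 1 GB input file
        long goalSize = fileLength / 2;         // hint of 2 map tasks

        long defaultSplit = computeSplitSize(goalSize, 1, blockSize);
        long hugeSplit = computeSplitSize(goalSize, Long.MAX_VALUE, blockSize);

        System.out.println("default min size: " + countSplits(fileLength, defaultSplit) + " splits");
        System.out.println("Long.MAX_VALUE:   " + countSplits(fileLength, hugeSplit) + " split");
    }
}
```

Under these assumptions the 1 GB file is cut into 16 splits with the default minimum, and into a single split once the minimum is Long.MAX_VALUE, which is what gives one mapper per input file.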

paul wrote:
> I've talked to a few people that claim to have done this as a way to limit
> resources for different groups, like developers versus production jobs.
> Haven't tried it myself yet, but it's getting close to the top of my to-do
> list.
> -paul
> On Fri, Aug 1, 2008 at 1:36 PM, James Moore <jamesthepiper@gmail.com> wrote:
>> On Thu, Jul 31, 2008 at 12:30 PM, Gopal Gandhi
>> <gopal.gandhi2008@yahoo.com> wrote:
>>> Thank you, finally someone has interests in my questions =)
>>> My cluster contains more than one machine. Please don't get me wrong :-).
>>> I don't want to limit the total mappers in one node (by mapred.map.tasks).
>>> What I want is to limit the total mappers for one job. The motivation is
>>> that I have 2 jobs to run at the same time. they have "the same input data
>>> in Hadoop". I found that one job has to wait until the other finishes its
>>> mapping. Because the 2 jobs are submitted by 2 different people, I don't
>>> want one job to be starving. So I want to limit the first job's total
>>> mappers so that the 2 jobs will be launched simultaneously.
>> What about running two different jobtrackers on the same machines,
>> looking at the same DFS files?  Never tried it myself, but it might be
>> an approach.
>> --
>> James Moore | james@restphone.com
>> Ruby and Ruby on Rails consulting
>> blog.restphone.com

Jason Venner
Attributor - Program the Web <http://www.attributor.com/>
Attributor is hiring Hadoop Wranglers and coding wizards, contact if 
