hadoop-common-user mailing list archives

From Yu Li <car...@gmail.com>
Subject Re: Dynamically set mapred.tasktracker.map.tasks.maximum from inside a job.
Date Wed, 30 Jun 2010 13:56:31 GMT
Hi Pierre,

The "setNumReduceTasks" method is for setting the number of reduce tasks to
launch, it's equal to set the "mapred.reduce.tasks" parameter, while the
"mapred.tasktracker.reduce.tasks.maximum" parameter decides the number of
tasks running *concurrently* on one node.
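
For example, a minimal sketch with the old mapred API (the class name and
the value 10 are only illustrative):

    import org.apache.hadoop.mapred.JobConf;

    public class ReduceCountSketch {
        public static void main(String[] args) {
            JobConf conf = new JobConf(ReduceCountSketch.class);

            // Equivalent job-level settings: both request 10 reduce tasks
            // for this job only.
            conf.setNumReduceTasks(10);
            conf.setInt("mapred.reduce.tasks", 10);

            // The tasktracker maximum is a different knob: a per-node
            // concurrency limit read by the TaskTracker at startup, so
            // setting it in job code has no effect.
            // conf.setInt("mapred.tasktracker.reduce.tasks.maximum", 4);
        }
    }
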
As Amareshwari mentioned, "mapred.tasktracker.map/reduce.tasks.maximum" is a
cluster-level configuration and cannot be set per job. If you set
mapred.tasktracker.map.tasks.maximum to 20 and the overall number of map
tasks is larger than 20 * <number of nodes>, there will be 20 map tasks
running concurrently on each node. As far as I know, you need to restart the
tasktrackers if you truly need to change this configuration.
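
If you do need different per-node limits like that, the property has to be
changed in mapred-site.xml on each node, roughly like this (value taken from
your example), and the tasktracker restarted afterwards:

    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>20</value>
    </property>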

Best Regards,
Carp

2010/6/30 Pierre ANCELOT <pierreact@gmail.com>

> Sure, but not the number of tasks running concurrently on a node at the
> same time.
>
>
>
> On Wed, Jun 30, 2010 at 1:57 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > The number of map tasks is determined by InputSplit.
> >
> > On Wednesday, June 30, 2010, Pierre ANCELOT <pierreact@gmail.com> wrote:
> > > Hi,
> > > Okay, so, if I set it to 20 by default, could I maybe limit the number
> > > of concurrent maps per node instead?
> > > job.setNumReduceTasks exists but I see no equivalent for maps, though I
> > > think there was a setNumMapTasks before...
> > > Was it removed? Why?
> > > Any idea about how to achieve this?
> > >
> > > Thank you.
> > >
> > >
> > > On Wed, Jun 30, 2010 at 12:08 PM, Amareshwari Sri Ramadasu <
> > > amarsri@yahoo-inc.com> wrote:
> > >
> > >> Hi Pierre,
> > >>
> > >> "mapred.tasktracker.map.tasks.maximum" is a cluster level
> configuration,
> > >> cannot be set per job. It is loaded only while bringing up the
> > TaskTracker.
> > >>
> > >> Thanks
> > >> Amareshwari
> > >>
> > >> On 6/30/10 3:05 PM, "Pierre ANCELOT" <pierreact@gmail.com> wrote:
> > >>
> > >> Hi everyone :)
> > >> There's something I'm probably doing wrong but I can't seem to figure
> > >> out what.
> > >> I have two hadoop programs running one after the other.
> > >> This is done because they don't have the same needs in terms of
> > >> processor and memory, so by separating them I optimize each task better.
> > >> Fact is, for the first job I need
> > >> mapred.tasktracker.map.tasks.maximum set to 12 on every node.
> > >> For the second job, I need it to be set to 20.
> > >> So by default I set it to 12, and in the second job's code I set this:
> > >>
> > >>        Configuration hadoopConfiguration = new Configuration();
> > >>        hadoopConfiguration.setInt("mapred.tasktracker.map.tasks.maximum", 20);
> > >>
> > >> But when running the job, instead of having the 20 tasks on each node
> > >> as expected, I have 12...
> > >> Any idea please?
> > >>
> > >> Thank you.
> > >> Pierre.
> > >>
> > >>
> > >> --
> > >> http://www.neko-consulting.com
> > >> Ego sum quis ego servo
> > >> "Je suis ce que je protège"
> > >> "I am what I protect"
> > >>
> > >>
> > >
> > >
> > > --
> > > http://www.neko-consulting.com
> > > Ego sum quis ego servo
> > > "Je suis ce que je protège"
> > > "I am what I protect"
> > >
> >
>
>
>
> --
>  http://www.neko-consulting.com
> Ego sum quis ego servo
> "Je suis ce que je protège"
> "I am what I protect"
>
