hadoop-common-user mailing list archives

From peter <zhangju...@gmail.com>
Subject Re: why is num of map tasks gets overridden?
Date Wed, 22 Aug 2012 06:01:43 GMT
You could consider adding nodes.

--  
peter


On Wednesday, August 22, 2012 at 1:57 PM, nutch buddy wrote:

> So what can I do if I have a given input and my job needs a lot of memory per map task?
> I can't control the number of map tasks, and the total memory per machine is limited,
> so I'll eventually fill each machine's memory.
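
A rough way to reason about the memory concern: in Hadoop 1.x a TaskTracker runs at most mapred.tasktracker.map.tasks.maximum map tasks at a time, so the heap in use on a node depends on that slot count and the per-task heap (set through mapred.child.java.opts), not on the total number of map tasks. The sketch below only does that arithmetic. The 152 maps come from this thread; the node count, slot count and heap size are hypothetical values, and the class and method names are made up for illustration.

```java
// Back-of-the-envelope arithmetic only; nothing here talks to a cluster.
// Assumption: every map task gets the same heap and slots are the only limit.
final class NodeMemoryEstimate {

    /** Upper bound on map-task heap in use on one node at any moment, in MB. */
    static long peakMapHeapMb(int mapSlotsPerNode, long heapPerTaskMb) {
        return (long) mapSlotsPerNode * heapPerTaskMb;
    }

    /** Number of "waves" needed to run all maps when only nodes * slots run at once. */
    static long waves(long totalMaps, int nodes, int mapSlotsPerNode) {
        long concurrent = (long) nodes * mapSlotsPerNode;
        return (totalMaps + concurrent - 1) / concurrent; // ceiling division
    }

    public static void main(String[] args) {
        long totalMaps = 152;      // from the thread
        int slotsPerNode = 2;      // hypothetical slot count per TaskTracker
        long heapPerTaskMb = 1024; // hypothetical, e.g. -Xmx1024m per child JVM

        System.out.println(peakMapHeapMb(slotsPerNode, heapPerTaskMb)); // 2048
        System.out.println(waves(totalMaps, 4, slotsPerNode));          // 19 waves on 4 nodes
        System.out.println(waves(totalMaps, 8, slotsPerNode));          // 10 waves on 8 nodes
    }
}
```

Under these assumptions the per-node footprint stays at 2048 MB whether the job has 8 maps or 152; more maps just means more waves, and adding nodes shortens the run rather than changing the per-node memory.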
>  
> On Tue, Aug 21, 2012 at 3:52 PM, Bertrand Dechoux <dechouxb@gmail.com> wrote:
> > > Actually controlling the number of maps is subtle. The mapred.map.tasks parameter
> > > is just a hint to the InputFormat for the number of maps. The default InputFormat
> > > behavior is to split the total number of bytes into the right number of fragments.
> > > However, in the default case the DFS block size of the input files is treated as an
> > > upper bound for input splits. A lower bound on the split size can be set via
> > > mapred.min.split.size. Thus, if you expect 10TB of input data and have 128MB DFS
> > > blocks, you'll end up with 82k maps, unless your mapred.map.tasks is even larger.
> > > Ultimately the InputFormat
> > > (http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/InputFormat.html)
> > > determines the number of maps.
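
The sizing rule described above can be sketched roughly as follows. This is a simplified model, not the actual Hadoop source: it assumes a single large input file, ignores the small slack Hadoop allows on the last split, and the class and method names (SplitEstimate, splitSize, estimateMaps) are made up for illustration.

```java
// Simplified model of how the old-API FileInputFormat sizes its splits.
// Assumptions: one large input file, sizes in bytes, no last-split slack.
final class SplitEstimate {
    static final long MB = 1024L * 1024L;
    static final long TB = 1024L * 1024L * MB;

    /** Split size: the hint-derived goal, capped by the block size, floored by the minimum. */
    static long splitSize(long totalBytes, int hintedMaps, long blockSize, long minSplitSize) {
        long goalSize = totalBytes / Math.max(hintedMaps, 1); // mapred.map.tasks is only a hint
        return Math.max(minSplitSize, Math.min(goalSize, blockSize));
    }

    /** Estimated number of map tasks: one per split, rounded up. */
    static long estimateMaps(long totalBytes, int hintedMaps, long blockSize, long minSplitSize) {
        long size = splitSize(totalBytes, hintedMaps, blockSize, minSplitSize);
        return (totalBytes + size - 1) / size; // ceiling division
    }

    public static void main(String[] args) {
        long total = 10 * TB;
        long block = 128 * MB;

        // A hint of 8 is overridden: the block size caps the split size.
        System.out.println(estimateMaps(total, 8, block, 1));        // 81920
        // A hint larger than the block count shrinks the splits, so it takes effect.
        System.out.println(estimateMaps(total, 200_000, block, 1));
        // Raising the minimum split size (mapred.min.split.size) reduces the map count.
        System.out.println(estimateMaps(total, 8, block, 512 * MB)); // 20480
    }
}
```

The first call reproduces the "82k maps" figure for 10TB of input with 128MB blocks; the last one shows that raising the lower bound on the split size is the knob that actually reduces the number of maps in this model.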
> >  
> > http://wiki.apache.org/hadoop/HowManyMapsAndReduces
> >  
> > Bertrand
> >  
> >  
> > On Tue, Aug 21, 2012 at 2:19 PM, nutch buddy <nutch.buddy@gmail.com> wrote:
> > >  
> > > I configured a job in Hadoop and set the number of map tasks in the code to 8.
> > >  
> > >  
> > > Then I ran the job and it got 152 map tasks. I can't see why it's being overridden
> > > or where it gets 152 from.
> > >  
> > >  
> > > The mapred-site.xml has 24 as mapred.map.tasks.
> > >  
> > >  
> > > Any idea?
> >
> > --  
> > Bertrand Dechoux
>  

