hadoop-mapreduce-user mailing list archives

From GUOJUN Zhu <guojun_...@freddiemac.com>
Subject Re: JVM reuse in Map Tasks
Date Tue, 05 Jun 2012 20:22:21 GMT
Yeah, I think so. For a mapper, that is probably not significant, as our
map tasks usually take minutes. However, we also have it in place for
combiners (the same class as the reducer), and there it becomes significant
because a combiner's configure() runs every time for each key (quite a few
in our case) at the end of every map task.

Zhu, Guojun
Modeling Sr Graduate
571-3824370
guojun_zhu@freddiemac.com
Financial Engineering
Freddie Mac



From Arpit Wanchoo <Arpit.Wanchoo@guavus.com>
Reply-To mapreduce-user@hadoop.apache.org
To "<mapreduce-user@hadoop.apache.org>" <mapreduce-user@hadoop.apache.org>
Subject Re: JVM reuse in Map Tasks
Date 06/05/2012 03:56 AM

Yes, I meant configure(JobConf).
I got that point.
So that means setup() is called for each mapper even if JVM reuse is
enabled.

If I understood correctly, then if I initialize a static variable (say
var) in setup(), and a mapper is started for the second time on the same
JVM, var would already be initialized before setup() is called, i.e. it
retains its value from the previously run mapper.
Is that right?



Regards,
Arpit Wanchoo | Sr. Software Engineer
Guavus Network Systems.
6th Floor, Enkay Towers, Tower B & B1,Vanijya Nikunj, Udyog Vihar Phase - 
V, Gurgaon,Haryana.
Mobile Number +91-9899949788 

On 04-Jun-2012, at 6:36 PM, GUOJUN Zhu wrote:


For setup(), do you mean configure(JobConf)? We need to deserialize a
big object and do some other preparation work on it within configure().
It takes a few seconds and is the same for all tasks. We just declare the
object as static and do not recreate it if it is not null. That way, we
make sure to create it only once and save the setup time for the rest of
the tasks.
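
The "static, create only once" approach described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the poster's actual code: no Hadoop classes are used, and CachedSetupTask, BigModel, and loadModel() are hypothetical names. The no-argument configure() stands in for configure(JobConf), and two instances created in one process stand in for two tasks sharing a reused JVM.

```java
// Plain-Java sketch of the "static, create once" pattern; no Hadoop classes are used.
// CachedSetupTask, BigModel and loadModel() are hypothetical names for illustration.
class CachedSetupTask {

    /** Stand-in for the large object that is expensive to deserialize. */
    static final class BigModel {
        final String payload;

        BigModel(String payload) {
            this.payload = payload;
        }
    }

    // Static fields exist once per loaded class, so every task instance
    // created in the same JVM sees the same values.
    private static BigModel model;
    private static int loadCount = 0;

    /** Stand-in for configure(JobConf): called once for every task instance. */
    public void configure() {
        // Lock on the class in case several instances are configured concurrently.
        synchronized (CachedSetupTask.class) {
            if (model == null) {
                model = loadModel(); // expensive work happens only on the first call
            }
        }
    }

    /** Hypothetical expensive initialization; counts how often it actually runs. */
    private static BigModel loadModel() {
        loadCount++;
        return new BigModel("deserialized state");
    }

    static BigModel model() {
        return model;
    }

    static int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        // Two instances in one process mimic two tasks that share a JVM.
        CachedSetupTask firstTask = new CachedSetupTask();
        firstTask.configure();
        CachedSetupTask secondTask = new CachedSetupTask();
        secondTask.configure();
        System.out.println("loads performed: " + loadCount()); // prints "loads performed: 1"
    }
}
```

The method still runs for every instance; only the expensive step inside it is skipped. The cached object survives only as long as the JVM does, so a task started in a fresh JVM pays the load cost again, and whatever is cached has to be safe to share between tasks.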

Zhu, Guojun
Modeling Sr Graduate
571-3824370
guojun_zhu@freddiemac.com
Financial Engineering
Freddie Mac 


From Arpit Wanchoo <Arpit.Wanchoo@guavus.com>
Reply-To mapreduce-user@hadoop.apache.org
To "mapreduce-user@hadoop.apache.org" <mapreduce-user@hadoop.apache.org>
Subject JVM reuse in Map Tasks
Date 06/04/2012 08:12 AM

Hi,

I wanted to check what exactly we gain when JVM reuse is enabled in a
MapReduce job.

My question is about the setup() method of the mapper. Is it called for a
mapper even if that mapper is using the JVM of a previously run mapper?
If yes, is there any way I can control it or stop it from being called
more than once?

Regards,
Arpit Wanchoo | Sr. Software Engineer
Guavus Network Systems.
6th Floor, Enkay Towers, Tower B & B1,Vanijya Nikunj, Udyog Vihar Phase - 
V, Gurgaon,Haryana.
Mobile Number +91-9899949788 


