hadoop-common-user mailing list archives

From "tim robertson" <timrobertson...@gmail.com>
Subject Re: Hadoop Internal Architecture writeup
Date Fri, 28 Nov 2008 07:37:43 GMT
Hi Ricky,

As a newcomer to MR and Hadoop, I think what you are doing is a great
addition to the docs.  One thing I would like to see in this overview
is how JVMs are spawned in the process - e.g. is it 1 JVM per node
per job, or per node per task, etc.?  The reason being it has
implications for setting up JVM-wide in-memory caches for the Maps /
Reducers to utilise (e.g. a big lookup hashtable which the Map will
use).  I assume it is a JVM per task per node, to allow the task
tracker to kill the process and restrict memory per task, but I am not
sure.

Thanks again


On Fri, Nov 28, 2008 at 7:59 AM, Amar Kamat <amarrk@yahoo-inc.com> wrote:
> Ricky Ho wrote:
>> I put together an article describing the internal architecture of Hadoop
>> (HDFS, MapRed).  I'd love to get some feedback if you see anything
>> inaccurate or missing ...
>> http://horicky.blogspot.com/2008/11/hadoop-mapreduce-implementation.html
> Few comments on MR :
> 1) "The JobTracker will first determine the number of splits (each split is
> configurable, ~16-64MB) from the input path, and select some TaskTracker
> based on their network proximity to the data sources, then the JobTracker
> send the task requests to those selected TaskTrackers."
> The jobclient, while submitting the job, calculates the splits using the
> InputFormat specified by the user. Internally the InputFormat might make use
> of the dfs block size, the user-hinted number of maps, etc. The jobtracker is
> given 3 files:
> - job.xml : job control parameters
> - job.split : the split file
> - job.jar : user map-reduce code
> Hence the JobTracker never actually *calculates* the splits.
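The client-side split calculation Amar describes can be sketched roughly as follows. This is an illustrative sketch only, not the real Hadoop API: the function name and the exact sizing rule (hint capped by block size) are assumptions mimicking a FileInputFormat-style getSplits().

```python
# Illustrative sketch (hypothetical names, not the actual Hadoop API):
# the jobclient carves an input file into splits using the DFS block
# size and a user-hinted number of maps.
def compute_splits(file_size, block_size, num_maps_hint=1):
    """Return (offset, length) pairs covering the file."""
    # Aim for the hinted number of maps, but never exceed the block size,
    # so each split tends to stay within a single DFS block.
    goal = max(1, file_size // max(1, num_maps_hint))
    split_size = min(block_size, goal)
    splits = []
    offset = 0
    while offset < file_size:
        length = min(split_size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits

# A 200 MB file with 64 MB blocks and no hint yields four splits
# (three full 64 MB splits plus an 8 MB tail).
print(compute_splits(200 * 2**20, 64 * 2**20))
```

The result would then be written to the job.split file that the jobclient hands to the JobTracker.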
> Also, the task-tracker asks for the task. It's a pull model: the tracker
> asks, and the jobtracker schedules based on (node/rack/switch) locality.
> The JobTracker never actually initiates scheduling.
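The pull model with locality preference can be sketched like this. All names here are hypothetical, for illustration only; the real scheduler lives inside the JobTracker and considers much more state.

```python
# Sketch of pull-based scheduling with locality (hypothetical names):
# the tracker asks for work, and the jobtracker prefers a task whose
# input data lives on that tracker's node, then on its rack.
def pick_task(pending_tasks, tracker_node, tracker_rack):
    """pending_tasks: list of (task_id, data_node, data_rack) tuples."""
    for task_id, node, rack in pending_tasks:
        if node == tracker_node:          # node-local: best case
            return task_id
    for task_id, node, rack in pending_tasks:
        if rack == tracker_rack:          # rack-local: second best
            return task_id
    # Off-rack fallback: any pending task, or nothing to hand out.
    return pending_tasks[0][0] if pending_tasks else None

tasks = [("t1", "nodeA", "rack1"), ("t2", "nodeB", "rack2")]
print(pick_task(tasks, "nodeB", "rack2"))  # → t2 (node-local wins over list order)
```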
> 2) For each record parsed by the "InputFormat"
> To be more precise, it is the "RecordReader" supplied by the "InputFormat"
> that parses each record.
> 3) A periodic wakeup process will sort the memory buffer into different
> reducer node by invoke the "combine" function.
> Need to cross-check whether it is a wakeup process or an on-demand thread
> that is spawned once the buffer is nearly full. Btw, the function that
> determines which key-value pair goes to which reducer is called the
> "Partitioner". The Combiner is just an optimization that does a local
> merge/reduction/aggregation of the data before sending it over the network.
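The Partitioner/Combiner distinction Amar draws can be sketched as below. This is a minimal illustration with assumed names: the partition function mirrors Hadoop's HashPartitioner idea, and the combine function shows a word-count-style local aggregation.

```python
# Sketch (hypothetical, not the real Hadoop classes):
# the Partitioner decides which reducer receives each key; the Combiner
# reduces map output locally before it crosses the network.
def partition(key, num_reducers):
    # HashPartitioner-style: same key always maps to the same reducer.
    return hash(key) % num_reducers

def combine(pairs):
    """Local aggregation of (key, count) pairs, as in word count."""
    out = {}
    for k, v in pairs:
        out[k] = out.get(k, 0) + v
    return sorted(out.items())

map_output = [("a", 1), ("b", 1), ("a", 1)]
print(combine(map_output))  # → [('a', 2), ('b', 1)]
```

Note the combiner shrinks the data but never changes which reducer a key goes to; only the partitioner decides that.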
> 4) When all the TaskTrackers are done, the JobTracker will notify the
> selected TaskTrackers for the reduce phase.
> This process is interleaved/parallelized. As soon as a map is done, the
> JobTracker is notified. Once a tracker (with a reducer) asks for events,
> these new events are passed along. Hence the map-output pulling (Shuffle
> Phase) works in parallel with the Map Phase. The Reduce Phase can start only
> once all the (resp) map outputs are copied and merged.
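The event pull Amar describes can be sketched as follows. Class and method names are hypothetical; the point is only that reducers poll for map-completion events with a cursor, so shuffling overlaps the still-running map phase.

```python
# Sketch of map-completion event polling (hypothetical names): reducers
# poll the jobtracker for new events and start copying map outputs
# while other maps are still running.
class JobTrackerEvents:
    def __init__(self):
        self.events = []            # completed map ids, in finish order

    def map_done(self, map_id):
        self.events.append(map_id)  # recorded as each map finishes

    def poll(self, from_index):
        """Return events the caller has not yet seen, plus a new cursor."""
        new = self.events[from_index:]
        return new, from_index + len(new)

jt = JobTrackerEvents()
jt.map_done("map-0")
new, cursor = jt.poll(0)            # shuffle begins: copy map-0's output
jt.map_done("map-1")                # map phase still running in parallel
new2, cursor = jt.poll(cursor)      # next poll picks up only map-1
print(new, new2)  # → ['map-0'] ['map-1']
```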
> 5) The JobTracker keep tracks of the progress of each phases and
> periodically ping the TaskTracker for their health status.
> It's again a push rather than a pull: trackers report their status instead
> of the JobTracker asking them.
> 6) When any of the map phase TaskTracker crashes, the JobTracker will
> reassign the map task to a different TaskTracker node, which will rerun all
> the assigned splits.
> There is a 1-1 mapping between a split and a map task. Hence it will re-run
> the map on the corresponding split.
> 7) After both phase completes, the JobTracker will unblock the client
> program.
> The client is unblocked once the job is submitted. The way it works is as
> follows:
> - jobclient requests a unique job id from the jobtracker
> - jobclient does some sanity checks to see if the output folder exists etc
> ...
> - jobclient uploads the job files (xml, jar, split) to a known location
> called the System-Directory
> - jobclient informs the jobtracker that the files are ready, and the
> jobtracker returns control.
> It's not necessary that the jobclient always blocks on job submission.
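The submission steps above can be sketched end to end. Everything here is hypothetical (the real exchange is RPC between JobClient and JobTracker); the fakes exist only to make the sequence of steps runnable.

```python
# Sketch of the jobclient submission sequence (all names hypothetical).
def submit_job(jobtracker, fs, job_files, output_dir):
    job_id = jobtracker.new_job_id()            # 1. get a unique job id
    if fs.exists(output_dir):                   # 2. sanity check
        raise ValueError("output folder already exists")
    system_dir = f"/system/{job_id}"
    for f in job_files:                         # 3. upload xml, split, jar
        fs.upload(f, system_dir)
    jobtracker.job_ready(job_id)                # 4. hand off; client returns
    return job_id

class FakeFS:
    def __init__(self):
        self.paths = set()
    def exists(self, p):
        return p in self.paths
    def upload(self, f, d):
        self.paths.add(f"{d}/{f}")

class FakeJobTracker:
    def __init__(self):
        self.ready, self.next_id = [], 0
    def new_job_id(self):
        self.next_id += 1
        return f"job_{self.next_id}"
    def job_ready(self, job_id):
        self.ready.append(job_id)

jt, fs = FakeJobTracker(), FakeFS()
print(submit_job(jt, fs, ["job.xml", "job.split", "job.jar"], "/out"))  # → job_1
```

After step 4 the client has control back; whether it then blocks waiting for completion is up to the caller, which matches Amar's last point.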
> Amar
>> Rgds,
>> Ricky
