hadoop-mapreduce-user mailing list archives

From Manoj Babu <manoj...@gmail.com>
Subject doubt on Hadoop job submission process
Date Mon, 13 Aug 2012 10:12:19 GMT
Hi All,

Normal Hadoop job submission process involves:

   1. Checking the input and output specifications of the job.
   2. Computing the InputSplits for the job
      (http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/InputSplit.html).
   3. Setting up the requisite accounting information for the DistributedCache
      (http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/filecache/DistributedCache.html)
      of the job, if necessary.
   4. Copying the job's jar and configuration to the map-reduce system
      directory on the distributed file system.
   5. Submitting the job to the JobTracker and optionally monitoring its
      status.
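For context, step 4 can be sketched in plain Java: the client conceptually copies the job jar and the serialized configuration into a per-job staging directory on the shared file system. This is only an illustrative sketch using local files, not the actual Hadoop client code (the real flow goes through JobClient against HDFS); the class name `StagingSketch`, the job id, and the directory layout here are placeholders.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of what the client conceptually does in step 4: stage the job's
// jar as "job.jar" and the configuration as "job.xml" under a per-job
// directory in the map-reduce system (staging) directory.
public class StagingSketch {
    // Copies jarPath into stagingRoot/<jobId>/job.jar and writes the
    // configuration XML next to it as job.xml; returns the job directory.
    static Path stageJob(Path stagingRoot, String jobId, Path jarPath,
                         String confXml) throws IOException {
        Path jobDir = Files.createDirectories(stagingRoot.resolve(jobId));
        Files.copy(jarPath, jobDir.resolve("job.jar"),
                   StandardCopyOption.REPLACE_EXISTING);
        Files.write(jobDir.resolve("job.xml"), confXml.getBytes("UTF-8"));
        return jobDir;
    }

    public static void main(String[] args) throws IOException {
        // Stand-ins for the staging dir, the user's jar, and the job config.
        Path tmp = Files.createTempDirectory("staging");
        Path fakeJar = Files.write(tmp.resolve("myjob.jar"), new byte[]{0});
        Path jobDir = stageJob(tmp, "job_201208130001_0001", fakeJar,
                               "<configuration/>");
        System.out.println(Files.exists(jobDir.resolve("job.jar")));
        System.out.println(Files.exists(jobDir.resolve("job.xml")));
    }
}
```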

I have a doubt about the 4th point of the job execution flow. Could any of
you explain it?

   - What is the job's jar?
   - Is the job's jar the one we submitted to Hadoop, or will Hadoop build
   it based on the job configuration object?



Cheers!
Manoj.
