hadoop-common-user mailing list archives

From "Aaron Kimball" <aa...@cloudera.com>
Subject Re: Hadoop Internal Architecture writeup
Date Wed, 10 Dec 2008 03:43:38 GMT
Good stuff -- please link to this resource from the Hadoop wiki (on the
"articles" page?)

- Aaron

On Sun, Nov 30, 2008 at 8:01 PM, Amar Kamat <amarrk@yahoo-inc.com> wrote:

> Hey, nice work and nice writeup. Keep it up.
> Comments inline.
> Amar
>
>
> -----Original Message-----
> From: Ricky Ho [mailto:rho@adobe.com]
> Sent: Fri 11/28/2008 9:45 AM
> To: core-user@hadoop.apache.org
> Subject: RE: Hadoop Internal Architecture writeup
>
> Amar, thanks a lot.  This is exactly the kind of feedback that I am looking
> for ...  I have some more questions ...
>
> ==================
> The jobclient while submitting the job calculates the split using
> InputFormat which is specified by the user. Internally the InputFormat
> might make use of dfs-block size, user-hinted num-maps etc. The
> jobtracker is given 3 files
> - job.xml : job control parameters
> - job.split : the split file
> - job.jar : user map-reduce code
> ==================
> [Ricky]  What exactly does job.split contain?  I assume it contains the
> specification for each split (but not its data), such as the corresponding
> file and the byte range within that file.  Correct?
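>
> For what it's worth, the per-split metadata in the old mapred API looks
> like this (a minimal sketch; the path, size, and hosts below are made up,
> and job.split's on-disk encoding is a framework detail):
>
>     import org.apache.hadoop.fs.Path;
>     import org.apache.hadoop.mapred.FileSplit;
>
>     // A split records where its data lives, not the data itself.
>     FileSplit split = new FileSplit(
>         new Path("/user/ricky/input/part-00000"),  // the file
>         0L,                                        // start offset in bytes
>         64L * 1024 * 1024,                         // length, often one dfs block
>         new String[] { "node1", "node2" });        // hosts holding the block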
>
> ====================
> This process is interleaved/parallelized. As soon as a map is done, the
> JobTracker is notified. Once a tracker (with a reducer) asks for events,
> these new events are passed. Hence the map output pulling (Shuffle
> Phase) works in parallel with the Map Phase. The Reduce Phase can start
> only once all the respective map outputs are copied and merged.
> =====================
> [Ricky]  I am curious why the reduce execution can't start earlier (before
> all the map tasks have completed).  The value "iterator" inside the
> user-defined reduce() method could block while waiting for more map tasks
> to complete.  In other words, map() and reduce() could proceed with
> pipeline parallelism.
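>
> The completion events mentioned above are also visible through the client
> API; reducers learn about finished maps from the same event stream. A
> minimal sketch, assuming the old mapred API and a surrounding main() that
> throws Exception:
>
>     import org.apache.hadoop.mapred.JobClient;
>     import org.apache.hadoop.mapred.JobConf;
>     import org.apache.hadoop.mapred.RunningJob;
>     import org.apache.hadoop.mapred.TaskCompletionEvent;
>
>     JobConf conf = new JobConf();
>     RunningJob rj = new JobClient(conf).submitJob(conf);
>
>     int from = 0;
>     while (!rj.isComplete()) {
>       // Ask the jobtracker for events we have not seen yet.
>       TaskCompletionEvent[] events = rj.getTaskCompletionEvents(from);
>       from += events.length;
>       for (TaskCompletionEvent e : events) {
>         System.out.println(e); // prints task id and status
>       }
>       Thread.sleep(1000);
>     }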
>
>
> ======================
> There is a 1-1 mapping between a split and a map task. Hence it will
> re-run the map on the corresponding split.
> ======================
> [Ricky]  Do you mean that if the job has 5000 splits, it requires 5000
> TaskTracker VMs (one for each split)?
>
> comment:
> If the job has 5000 splits, then it requires 5000 task VMs (one for each
> split), but not 5000 TaskTrackers. The TaskTracker is a framework daemon: a
> process (JVM) that handles/manages tasks (the processes that process
> splits) on a node. A TaskTracker is identified by (node-hostname + port). A
> task is never executed inside the TaskTracker's own JVM; a new JVM is
> spawned for it, so that faulty user code (map/reduce) cannot bring down the
> TaskTracker (a framework process). With hadoop-0.19, however, we have JVM
> reuse, so 5000 splits might require fewer than 5000 VMs. Note that some
> tasks might also be speculatively executed, which can add to the VM count.
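>
> For example, 0.19's JVM reuse is controlled per job (a hedged sketch,
> assuming 0.19's JobConf API; MyJob below is a made-up class):
>
>     import org.apache.hadoop.mapred.JobConf;
>
>     JobConf conf = new JobConf(MyJob.class);
>     // 1 (the default) spawns a fresh JVM per task; n > 1 lets a JVM run
>     // up to n tasks of this job in sequence; -1 means no limit.
>     conf.setNumTasksToExecutePerJvm(-1);
>     // Equivalently: conf.setInt("mapred.job.reuse.jvm.num.tasks", -1);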
> Amar
>
>
> =======================
> The client is unblocked once the job is submitted. The way it works is
> as follows :
> - jobclient requests the jobtracker for a unique job id
> - jobclient does some sanity checks to see if the output folder exists
> etc ...
> - jobclient uploads job files (xml, jar, split) onto a known location
> called System-Directory
> ========================
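>
> As the excerpt says, submitJob() returns as soon as the job files are
> uploaded; only the runJob() convenience call blocks. A minimal sketch,
> assuming the old mapred API (MyJob is a made-up class):
>
>     import org.apache.hadoop.mapred.JobClient;
>     import org.apache.hadoop.mapred.JobConf;
>     import org.apache.hadoop.mapred.RunningJob;
>
>     JobConf conf = new JobConf(MyJob.class);
>     RunningJob rj = new JobClient(conf).submitJob(conf);
>     System.out.println("submitted: " + rj.getJobID());
>
>     // JobClient.runJob(conf) is the blocking variant: it submits the job
>     // and then polls the jobtracker until the job finishes.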
> [Ricky]  Is this a well-known folder within HDFS?
>
> This is set via "mapred.system.dir" during cluster startup (see
> hadoop-default.xml). It's a framework directory.
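>
> For example, a cluster admin might set it in hadoop-site.xml (which
> overrides hadoop-default.xml; the value below is illustrative):
>
>     <property>
>       <name>mapred.system.dir</name>
>       <value>/hadoop/mapred/system</value>
>       <description>Shared dir where the framework keeps the submitted
>       job files (job.xml, job.split, job.jar) while a job runs.
>       </description>
>     </property>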
>
>
>
