hadoop-mapreduce-user mailing list archives

From Hemanth Yamijala <yhema...@gmail.com>
Subject Re: Memory Manager in Hadoop MR
Date Fri, 10 Dec 2010 02:39:23 GMT

On Thu, Dec 9, 2010 at 4:35 PM, Pedro Costa <psdc1978@gmail.com> wrote:
> Hi,
> 1 - Hadoop MR contains a TaskMemoryManagerThread class that is used to
> manage memory usage of tasks running under a TaskTracker. Why does
> Hadoop MR need a class to manage memory? Why can't it rely on the JVM,
> or is this class here for another purpose?

There are streaming and pipes map/reduce applications that launch
native processes from the map/reduce tasks that are outside the
control of the JVM. Indeed, even regular Java map/reduce programs
could fork/exec other programs. All of these processes could consume
memory that would not be accounted for if we relied only on the JVM to
get the memory usage. Hence the separate class, which looks at the
entire process tree of the map/reduce task to account for all the
memory consumed.
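
To illustrate the idea of accounting over a process tree, here is a minimal
sketch in Python. It is not Hadoop's actual code: the function name
tree_memory, the snapshot layout (pid -> (parent pid, memory in bytes)) and
the 512 MB limit are all assumptions made for the example. On Linux, such a
snapshot could in principle be built from /proc/<pid>/stat.

```python
from collections import defaultdict


def tree_memory(processes, root_pid):
    """Sum the memory of root_pid and all of its descendants.

    processes is a hypothetical snapshot mapping
    pid -> (parent_pid, memory_bytes).
    """
    # Index children by parent so the tree can be walked top-down.
    children = defaultdict(list)
    for pid, (ppid, _mem) in processes.items():
        children[ppid].append(pid)

    total = 0
    seen = set()
    stack = [root_pid]
    while stack:
        pid = stack.pop()
        if pid in seen or pid not in processes:
            continue  # skip duplicates and pids missing from the snapshot
        seen.add(pid)
        total += processes[pid][1]
        stack.extend(children[pid])
    return total


MB = 1024 * 1024

# Hypothetical snapshot: the task JVM (pid 100) forks a streaming script
# (pid 101), which starts a native helper (pid 102). Pid 200 is unrelated.
snapshot = {
    100: (1, 200 * MB),
    101: (100, 50 * MB),
    102: (101, 300 * MB),
    200: (1, 999 * MB),
}

limit = 512 * MB  # assumed per-task limit, for illustration only
used = tree_memory(snapshot, 100)
over_limit = used > limit
```

In this made-up snapshot the JVM alone stays under the limit, but the tree as
a whole does not, which is the case that JVM-only accounting would miss.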

> 2 - How does the JT know that a Map or Reduce Task has finished? Is it
> through the heartbeat?


> Thanks
> --
> Pedro
