hadoop-common-user mailing list archives

From Ravi Phulari <rphul...@yahoo-inc.com>
Subject Re: Colocation of NameNode and JobTracker
Date Tue, 21 Jul 2009 02:08:06 GMT
Hello Roman,

If you have a large cluster, it is better to run the JobTracker and the NameNode on different
machines.
If your cluster is small (fewer than roughly 20-30 machines), you can run the JobTracker and the
NameNode on the same machine.
Again, it depends on the hardware configuration. NameNode and JobTracker machines usually have
a higher hardware specification than the data nodes.

It also depends on how big your cluster is and how much data you have in HDFS.
NameNode memory usage is directly proportional to the size of the HDFS namespace, that is, to the
number of files and directories on HDFS. The metadata and inode information for each file and
directory is stored in the NameNode namespace, which is held in main memory. Going by the number
of bytes used to store the metadata of each HDFS file in the namespace, the NameNode memory
requirement can be summarized as "10 million files require 4 GB of memory for the NameNode".
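
As a rough illustration of the rule of thumb above, here is a minimal Python sketch that scales
it linearly to other file counts. The function name and the constants are my own, and the linear
scaling and the reading of "4 GB" as 4 * 1024^3 bytes are assumptions. It is a back-of-the-envelope
estimate, not a sizing tool.

```python
# Rule of thumb quoted above: 10 million files need about 4 GB of NameNode memory.
# Assumption: "4 GB" is read as 4 * 1024**3 bytes and the relation is linear.
RULE_FILES = 10_000_000
RULE_BYTES = 4 * 1024**3


def estimate_namenode_memory_gb(num_files: int) -> float:
    """Extrapolate the rule of thumb linearly to num_files (hypothetical helper)."""
    if num_files < 0:
        raise ValueError("num_files must be non-negative")
    bytes_per_file = RULE_BYTES / RULE_FILES  # roughly 430 bytes per file
    return num_files * bytes_per_file / 1024**3


if __name__ == "__main__":
    for n in (1_000_000, 10_000_000, 50_000_000):
        print(f"{n:>12,} files -> ~{estimate_namenode_memory_gb(n):.1f} GB")
```

Whatever estimate you get, leave headroom on top of it for the JobTracker if both daemons
share a machine.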

In short, for a small cluster you can run the NameNode and the JobTracker on the same machine.


On 7/20/09 6:25 PM, "roman kolcun" <roman.wsmo@gmail.com> wrote:

Hello everyone,
is there any performance difference (or any advantage / disadvantage) in
colocating NameNode and JobTracker on the same node? Is it better to put
them on different nodes or on the same one?

Thank you for your answers.

Yours Sincerely,
