hadoop-common-user mailing list archives

From "Timothy Chklovski" <t...@isi.edu>
Subject Dynamic addition and removal of Hadoop nodes
Date Thu, 05 Apr 2007 21:58:12 GMT

We have been experimenting with Hadoop on a largish but shared cluster.
That means we can allocate various nodes, but would also like to let others
use them (so not having a node permanently is a bit like the situation on EC2).
We are interested in whether other users have developed approaches to get
machines to join (and leave) both the DFS and Tasktracker pools.

It does not seem very complicated, but we are wondering if the brute-force
approach ignores some arcana, e.g., whether refreshes should be called on
the namenode and the jobtracker.

Also, if we know a node will leave the pool, is there something that we can
tell the namenode and the jobtracker in advance to make the leaving less
disruptive (e.g., stop accepting new large jobs, or even go into safe mode)?

-> If people have developed approaches to automating how machines join
and leave pools, we'd love to know.
-> Furthermore, if it makes sense, please consider it a feature request that
this be automated/wrapped in scripts that can come with a Hadoop distribution
(or, if everything already works, then extending the documentation on how one
can accomplish this correctly).

Thanks much for Hadoop & continued work on it!

-- Tim

Timothy Chklovski
Senior Research Scientist
USC Information Sciences Institute
