hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: OK to run data node on same machine as secondary name node?
Date Thu, 16 Aug 2012 04:58:02 GMT
I'd not do this if the fsimage size is greater than, say, 5-6 GB. The
SNN pulls the image from the NameNode and then pushes it back, and the
transfer can get heavy. If you have
https://issues.apache.org/jira/browse/HDFS-1457 (image transfer
throttler) in the version of Hadoop you use, you can set it to a
proper value and keep the SNN on a slave node without worrying about
it hogging all the available bandwidth.
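
For a rough sense of what a "proper value" means, here is a minimal Python sketch of the arithmetic. It assumes the throttle is the hdfs-site.xml property dfs.image.transfer.bandwidthPerSec (a byte-per-second cap, where 0 means unthrottled), that the image crosses the network once in each direction per checkpoint, and that the capped rate is actually sustained. The 6 GiB image size and the 20 MiB/s cap are illustrative numbers, not recommendations, and the function name is hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def transfer_seconds(image_bytes: int, bandwidth_bytes_per_sec: int) -> float:
    """Approximate time to move one copy of the fsimage at a capped rate."""
    if bandwidth_bytes_per_sec <= 0:
        # A cap of 0 is taken to mean "unthrottled", so there is no
        # capped rate to base an estimate on.
        raise ValueError("bandwidth cap must be positive to estimate a duration")
    return image_bytes / bandwidth_bytes_per_sec


# Illustrative inputs (assumptions, not measured values).
image_size = 6 * GIB   # fsimage at the upper end of the 5-6 GB range
cap = 20 * MIB         # hypothetical throttle value in bytes per second

one_way = transfer_seconds(image_size, cap)
round_trip = 2 * one_way  # one pull from the NameNode, one push back

print(f"one way: {one_way:.1f} s, per checkpoint: {round_trip:.1f} s")
```

The trade-off is that a lower cap leaves more bandwidth for the DataNode on the same host but stretches each checkpoint, so the cap should still let a full round trip finish well inside the checkpoint interval.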

On Thu, Aug 16, 2012 at 3:41 AM, David Rosenstrauch <darose@darose.net> wrote:
> I have a Hadoop cluster that's a little tight on resources.  I was thinking
> one way I could solve this could be by running an additional data node on
> the same machine as the secondary name node.
> I wouldn't dare do that on the primary name node, since that machine needs
> to be extremely performant.  But since all the secondary name node does is
> merge the name node's checkpoint and logs, which is not an
> activity that requires top-notch real-time performance, I thought it might
> not be a problem if I were to set up a data node running there as well.
> Any reasons why that might be a bad idea?
> Thanks,
> DR

Harsh J
