hadoop-hdfs-user mailing list archives

From Robin Verlangen <ro...@us2.nl>
Subject HDFS disable balancing cluster
Date Fri, 17 Aug 2012 09:54:19 GMT
Hi there,

We currently run an eight node cluster on Amazon EC2. This is perfect for
our storage, but we want to add a couple of nodes (let's say 32) for
processing a big task. We spin them up, run the jobs, and terminate the

That sounds OK to me; however, I'm aware that Hadoop tries to
replicate data blocks to other nodes in order to balance the cluster. I
don't want this, as I will get under-replicated blocks when terminating the

We use Juju for easy cluster administration. This implies that adding a new
hadoop-slave runs both HDFS and Hadoop MapReduce (mapred).

My main question is: is it possible to disable balancing of the cluster, or
just to disable the datanode service on the new nodes (meant for processing

Best regards,

Robin Verlangen
*Software engineer*
W http://www.robinverlangen.nl
E robin@us2.nl

