hadoop-common-user mailing list archives

From Allen Wittenauer <awittena...@linkedin.com>
Subject Re: Is it safe to set default/minimum replication to 2?
Date Thu, 22 Jul 2010 18:30:28 GMT

On Jul 21, 2010, at 6:29 PM, Bobby Dennett wrote:

> The team that manages our Hadoop clusters is currently being pressured
> to reduce block replication from 3 to 2 in our production cluster. This
> request is for various reasons -- particularly the reduction of used
> space in the cluster and potential of reduced write operations -- but
> from what I've read previously, it seems to be strongly discouraged.
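
To put a rough number on the space side of that request: HDFS stores one full copy of each block per replica, so raw disk usage scales linearly with the replication factor. The sketch below is only that arithmetic. The function names and the 100 TB figure are hypothetical, and it ignores everything that is not replicated HDFS data (MapReduce temp space, logs, OS overhead).

```python
# Back-of-the-envelope arithmetic only; names and figures are illustrative.

TB = 1024 ** 4


def raw_bytes(logical_bytes: int, replication: int) -> int:
    """Raw disk space HDFS needs to hold logical_bytes at a given replication."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return logical_bytes * replication


def savings_fraction(old_replication: int, new_replication: int) -> float:
    """Fraction of raw HDFS space (and replica writes) saved by the change."""
    return 1 - new_replication / old_replication


if __name__ == "__main__":
    logical = 100 * TB  # hypothetical amount of user data
    print(raw_bytes(logical, 3) // TB, "TB raw at replication 3")
    print(raw_bytes(logical, 2) // TB, "TB raw at replication 2")
    print(f"{savings_fraction(3, 2):.1%} of raw HDFS space saved")
```

Going from 3 to 2 therefore frees about a third of the raw HDFS footprint, and the same ratio applies only to writes that go through HDFS replication.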

I'm trying to understand 'potential of reduced write operations'.  Does this mean you want
to reduce the wear and tear on each drive by reducing how many writes go
to it?

If so, you might want to research what your jobs are actually doing.  Since we split our drives
such that we have dedicated MR temp space and dedicated HDFS space, it was pretty easy
for us to see that the vast majority of our I/O (especially writes) is in spills, not from
