hadoop-common-user mailing list archives

From Mithila Nagendra <mnage...@asu.edu>
Subject Re: More Replication on dfs
Date Fri, 10 Apr 2009 05:26:05 GMT
To add to the question, how does one decide on the optimal replication
factor for a cluster? For instance, what would be an appropriate replication
factor for a cluster of 5 nodes?
Mithila

On Fri, Apr 10, 2009 at 8:20 AM, Alex Loddengaard <alex@cloudera.com> wrote:

> Did you load any files when replication was set to 3?  If so, you'll have
> to rebalance:
>
> <http://hadoop.apache.org/core/docs/r0.19.1/commands_manual.html#balancer>
> <http://hadoop.apache.org/core/docs/r0.19.1/hdfs_user_guide.html#Rebalancer>
>
> Note that most people run HDFS with a replication factor of 3.  Because 3
> is so common, clusters running with a replication factor of 2 have
> occasionally uncovered new bugs.  So if you can do it, it's probably
> advisable to run with a replication factor of 3 instead of 2.
>
> Alex
>
> On Thu, Apr 9, 2009 at 9:56 PM, Puri, Aseem <Aseem.Puri@honeywell.com> wrote:
>
> > Hi
> >
> > I am a new Hadoop user. I have a small cluster with 3 Datanodes. In
> > hadoop-site.xml the value of the dfs.replication property is 2, but
> > data is still being replicated on 3 machines.
> >
> > Could you please tell me why this is happening?
> >
> > Regards,
> > Aseem Puri
> >
>
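
For reference, the setting discussed in this thread is a property entry in
hadoop-site.xml. What follows is only a minimal sketch, assuming the value of
2 from the original question and no other properties in the file. As far as I
know, the value is applied as the default when a file is created, so files
written while it was 3 would keep 3 replicas.

```xml
<!-- hadoop-site.xml: illustrative fragment only; a real file will usually
     contain other properties as well -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <!-- Default number of replicas for newly created files -->
    <value>2</value>
  </property>
</configuration>
```

For files that already exist, the replication factor can be changed per path
with "hadoop fs -setrep", for example "hadoop fs -setrep -R 2 /some/path"
(the path here is a hypothetical placeholder).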
