hadoop-hdfs-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: replication
Date Tue, 31 Jan 2012 12:33:14 GMT
The replication factor is per-file (not HDFS-wide) and can be
controlled by setting "dfs.replication" in your job configuration to
the desired value; this applies to every file the job writes.
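One way to pass this per job, assuming the job's driver uses ToolRunner/GenericOptionsParser (so generic `-D` options are honored), is on the command line; the jar name, class, and paths below are placeholders:

```shell
# Run a job with a replication factor of 2 for its output files.
# "myjob.jar", "MyJob", and the paths are hypothetical examples.
hadoop jar myjob.jar MyJob -D dfs.replication=2 /input /output
```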

If you want to propagate this via config files instead, place your
chosen "dfs.replication" default value inside conf/hdfs-site.xml.
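As a sketch, the hdfs-site.xml entry would look like this (the value 2 is just an example):

```xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
```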

On Tue, Jan 31, 2012 at 5:49 PM, Alieh Saeedi <aliehsaeedi@yahoo.com> wrote:
> As I read in the Hadoop tutorial, Hadoop replicates file blocks by a factor
> (default 3); in other words, it replicates each file block 3 times. Does
> Hadoop do this for all files? I mean, are files written by reducers
> replicated too?

Yes, all files written to HDFS will be replicated, but you can control
the number of replicas as detailed at the top of this post. Setting the
replication factor to 1 means no extra copies are kept.
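For files that already exist, replication can also be changed after the fact with the FileSystem shell's setrep command; the path below is a hypothetical example:

```shell
# Reduce existing files under /user/alieh/output to a single replica;
# -w waits until the replication change completes.
hadoop fs -setrep -w 1 /user/alieh/output
```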

Harsh J
Customer Ops. Engineer, Cloudera
