hadoop-mapreduce-user mailing list archives

From Peyman Mohajerian <mohaj...@gmail.com>
Subject Re: recommended block replication for small cluster
Date Thu, 03 Apr 2014 13:13:22 GMT
Replication is not only about reliability: in a larger cluster it also
improves data locality for map-reduce jobs. You can reduce the replication
factor; that's why it's a configurable parameter.
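
To put rough numbers on the reliability side of the question, here is a
minimal sketch in Python. It rests on simplifying assumptions that are not
from this thread: each block's replicas sit on distinct nodes chosen
uniformly at random (no rack awareness), blocks are placed independently,
and a fixed number of nodes is down at the same moment. The function names
are made up for illustration.

```python
from math import comb


def block_loss_probability(nodes: int, replication: int, failed: int) -> float:
    """Chance that one block loses every replica when `failed` nodes are down.

    Assumption: the block's replicas are on `replication` distinct nodes
    picked uniformly at random out of `nodes`.
    """
    if failed < replication:
        # Fewer failed nodes than replicas: at least one copy survives.
        return 0.0
    # All replica holders must fall inside the set of failed nodes.
    return comb(failed, replication) / comb(nodes, replication)


def any_block_loss_probability(
    nodes: int, replication: int, failed: int, blocks: int
) -> float:
    """Chance that at least one of `blocks` blocks is lost.

    Assumption: blocks are placed independently of each other.
    """
    p = block_loss_probability(nodes, replication, failed)
    return 1.0 - (1.0 - p) ** blocks


if __name__ == "__main__":
    # Hypothetical cluster: 20 nodes, 10,000 blocks, 2 nodes down at once.
    for replication in (2, 3):
        p = any_block_loss_probability(20, replication, failed=2, blocks=10_000)
        print(f"replication={replication}: P(some block lost) = {p:.4f}")
```

Under these assumptions, a single block at replication 2 is unlikely to be
hit by any given two-node failure, but with many blocks it becomes very
likely that some block had both copies on the failed pair. So the figure
that matters is how often two nodes are down together before
re-replication finishes, not just how rare that looks per block.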


On Thu, Apr 3, 2014 at 7:10 AM, Fengyun RAO <raofengyun@gmail.com> wrote:

> I know the default replication is 3, which ensures reliability when 2
> nodes crash at the same time.
>
> However, for a small cluster, e.g. 10~20 nodes, the probability that 2
> nodes crash at the same time is very small.
>
> Can we simply set the replication to 2, or are there any other drawbacks?
>
> Any information is appreciated!
>
