hadoop-hdfs-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: HDFS performance with and without replication
Date Sun, 15 Sep 2013 18:39:22 GMT
Write performance improves with fewer replicas, because HDFS writes each
block through a synchronous, sequential pipeline of DataNodes. Read
performance should be about the same, except in the worst case where the
only replica sits on a single busy rack and no rack-local read can be
scheduled.
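
To put rough numbers on the write side, here is a minimal sketch of how
much data the cluster has to move for one file at a given replication
factor. It is a back-of-the-envelope model, not a benchmark: the function
name write_cost and the client_on_datanode flag are made up for this
example, and it assumes the default placement behaviour where the first
replica lands on the writer's own node when the writer runs on a DataNode.
It ignores pipeline latency, disk contention and checksum overhead.

```python
def write_cost(file_bytes, replication, client_on_datanode=True):
    """Estimate cluster-wide bytes moved when writing one file (hypothetical helper).

    Returns a (disk_bytes, network_bytes) tuple.
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")

    # Every replica is a full copy on some DataNode's disk.
    disk_bytes = file_bytes * replication

    # Assumption: if the writer runs on a DataNode, the first replica is
    # local and only the remaining replicas cross the network.
    network_copies = replication - 1 if client_on_datanode else replication
    network_bytes = file_bytes * network_copies

    return disk_bytes, network_bytes


if __name__ == "__main__":
    one_gib = 1024 ** 3
    for r in (1, 3):
        disk, net = write_cost(one_gib, r)
        print(f"replication={r}: disk={disk} bytes, network={net} bytes")
```

Under these assumptions, a task writing temporary data from a DataNode
with replication=1 sends nothing over the network, while replication=3
writes three times the data to disk and sends two copies across the wire.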

On Sun, Sep 15, 2013 at 10:38 PM, John Lilley <john.lilley@redpoint.net> wrote:
> In our YARN application, we are considering whether to store temporary data
> with replication=1 or replication=3 (or give the user an option).  Obviously
> there is a tradeoff between reliability and performance, but on smaller
> clusters I’d expect this to be less of an issue.
> What is the difference in write performance using replication=1 vs 3?  For
> reading I’d expect the performance to be roughly equivalent.
> john

Harsh J