hadoop-common-user mailing list archives

From Stas Oskin <stas.os...@gmail.com>
Subject Re: HDFS read/write speeds, and read optimization
Date Sun, 17 Jan 2010 17:26:13 GMT

> We run with 2-way replication.  The wonderful folks at Yahoo! worked through
> most of the bugs during 0.19.x IIRC.  There were never any bugs with 2-way
> replication per se, but running a cluster with 2 replicas exposed other bugs
> at a 100x rate compared to running with 3 replicas (due to the fact that a
> silent corruption + loss of a single data node = file loss).
> I'd estimate we lose files at a rate of about 1 per month for 200TB of
> actual data.  That number would probably go down an order of magnitude or
> more if we were running with 3 replicas.
> Hope this helps.
Thanks for sharing!

So, is there good reason to believe that versions 0.19 and higher have the
file storage / silent corruption issues sorted out?

