hadoop-mapreduce-user mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Re: Hadoop hardware failure recovery
Date Mon, 13 Aug 2012 16:10:47 GMT
On 13 August 2012 08:42, Harsh J <harsh@cloudera.com> wrote:

> Hey Steve,
> Interesting, thanks for pointing that out! I didn't know that it
> disables this by default :)
It's always something to watch out for: someone implementing a disk FS, OS,
or VM environment discovers that they get great benchmark numbers if they
make flushing async, and decides that "most people don't need it anyway".
That's mostly true, and some programs do over-flush, but if you do want to
be sure your data is saved to disk, these people are being dangerous rather
than helpful.
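
For illustration only (not part of the original message), here is a minimal
Python sketch of what "being sure your data is saved to disk" usually means
from the application side: drain the userspace buffer, then ask the OS to
flush its cache. The helper name write_durably is hypothetical. Note the
assumption it rests on: the layers below (FS, hypervisor, disk) must honour
the flush request, and an environment that makes flushing async can return
from fsync before the data is on stable storage.

```python
import os


def write_durably(path, data):
    """Write bytes to path and request that they reach the storage device.

    Hypothetical helper for illustration; durability still depends on the
    FS, VM, and disk honouring the flush request.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # drain Python's userspace buffer into the OS
        os.fsync(f.fileno())   # ask the OS to flush its cache to the device

    # On POSIX, also fsync the parent directory so that the new directory
    # entry is persisted, not just the file contents.
    if os.name == "posix":
        dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

Calling write_durably("journal.bin", b"record") blocks until the OS reports
the flush as complete, which is exactly the cost an async-flush setup hides.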

I don't think it's an issue if you are saving to network-mounted storage,
which can include the storage of the host OS. If you do some experiments,
NFS traffic to the host system is usually as fast as working with a virtual
HDD on the same host OS, which has extra layers of indirection. Virtual
HDDs can get fragmented even when the VM thinks it has just allocated big
linear blocks: you need to defrag the virtual HDD, then the physical disk
image, to correct that.
