hadoop-common-user mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: Hadoop performance - xfs and ext4
Date Tue, 11 May 2010 17:51:59 GMT
Ah, one more thing.  XFS has an online defragmenter -- it runs every night on my
cluster.  Benchmarks on a fresh, empty filesystem will not match the performance of a used
one that has become fragmented.
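
For anyone wanting to try the same thing, a nightly run might look roughly like the cron fragment below. This is only a sketch: it assumes the defragmenter in question is xfs_fsr from the XFS userspace tools, and the file name, binary path, start time and time limit are all illustrative rather than taken from any real cluster.

```
# /etc/cron.d/xfs-defrag  (hypothetical file name)
# Run xfs_fsr at 02:00 every night as root. With no filesystem argument it
# works through the mounted XFS filesystems; -t caps the run, in seconds.
# The path /usr/sbin/xfs_fsr is an assumption; check where your distro puts it.
0 2 * * * root /usr/sbin/xfs_fsr -t 7200

# To see how fragmented a device is before and after (read-only check;
# DEV is a placeholder, as in the mkfs commands quoted below):
#   xfs_db -r -c frag DEV
```

Comparing the xfs_db fragmentation factor on a fresh filesystem and on one that has been in use for a while would show whether fragmentation could be affecting a benchmark.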


On Apr 22, 2010, at 1:02 AM, stephen mulcahy wrote:

> Hi,
> 
> I've been tweaking our cluster roll-out process to refine it. While 
> doing so, I decided to check if XFS gives any performance benefit over EXT4.
> 
> As per a comment I read somewhere on the hbase wiki - XFS makes for 
> faster formatting of filesystems (it takes us 5.5 minutes to rebuild a 
> datanode from bare metal to a full Hadoop config on top of Debian 
> Squeeze using XFS) versus EXT4 (same bare metal restore takes 9 minutes).
> 
> However, TeraSort performance on a cluster of 45 of these data-nodes 
> shows XFS is slower (same configuration settings on both installs other 
> than changed filesystem), specifically,
> 
> mkfs.xfs -f -l size=64m DEV
> (mounted with noatime,nodiratime,logbufs=8)
> gives me a cluster which runs TeraSort in about 23 minutes
> 
> mkfs.ext4 -T largefile4 DEV
> (mounted with noatime)
> gives me a cluster which runs TeraSort in about 18.5 minutes
> 
> So I'll be rolling our cluster back to EXT4, but thought the information 
> might be useful/interesting to others.
> 
> -stephen
> 
> 
> XFS config chosen from notes at 
> http://everything2.com/index.pl?node_id=1479435
> 
> -- 
> Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
> NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
> http://di2.deri.ie    http://webstar.deri.ie    http://sindice.com
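
To put the quoted timings in relative terms, here is a small sketch that computes the percentage differences. The helper name pct_slower is made up for illustration; the inputs are the minute figures from the message above.

```shell
#!/bin/sh
# pct_slower SLOW FAST: print how much longer SLOW took than FAST, in percent.
# (Hypothetical helper; awk does the floating-point arithmetic.)
pct_slower() {
    awk -v slow="$1" -v fast="$2" \
        'BEGIN { printf "%.1f\n", (slow - fast) / fast * 100 }'
}

# Timings in minutes, as quoted in the message.
echo "TeraSort on XFS vs EXT4: $(pct_slower 23 18.5)% longer"
echo "Rebuild on EXT4 vs XFS:  $(pct_slower 9 5.5)% longer"
```

By these figures the TeraSort run took roughly 24% longer on XFS, while the bare-metal rebuild took roughly 64% longer on EXT4. These are single runs on one cluster configuration, so the numbers are indicative at best.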

