hadoop-common-user mailing list archives

From stephen mulcahy <stephen.mulc...@deri.org>
Subject Hadoop performance - xfs and ext4
Date Thu, 22 Apr 2010 08:02:58 GMT

I've been tweaking our cluster roll-out process to refine it. While 
doing so, I decided to check if XFS gives any performance benefit over EXT4.

As per a comment I read somewhere on the hbase wiki, XFS makes for 
faster formatting of filesystems: it takes us 5.5 minutes to rebuild a 
datanode from bare metal to a full Hadoop config on top of Debian 
Squeeze using XFS, versus 9 minutes for the same bare-metal restore using EXT4.

However, TeraSort performance on a cluster of 45 of these datanodes 
shows XFS is slower (same configuration settings on both installs, other 
than the changed filesystem). Specifically:

mkfs.xfs -f -l size=64m DEV
(mounted with noatime,nodiratime,logbufs=8)
gives me a cluster which runs TeraSort in about 23 minutes

mkfs.ext4 -T largefile4 DEV
(mounted with noatime)
gives me a cluster which runs TeraSort in about 18.5 minutes
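
For anyone wanting to repeat the comparison, the two setups above can be sketched as a dry-run script. It only prints the commands rather than running them, since mkfs destroys data. The device /dev/sdX1, the mount point /data and the function name print_setup are placeholders assumed for illustration; only the mkfs and mount options come from the message.

```shell
#!/bin/sh
# Dry run: prints the format and mount commands instead of executing them.
# DEV and MNT are hypothetical placeholders, not values from the original message.
DEV="${DEV:-/dev/sdX1}"
MNT="${MNT:-/data}"

print_setup() {
    case "$1" in
        xfs)
            # 64 MB log at format time, 8 in-memory log buffers at mount time
            echo "mkfs.xfs -f -l size=64m $DEV"
            echo "mount -t xfs -o noatime,nodiratime,logbufs=8 $DEV $MNT"
            ;;
        ext4)
            # largefile4 usage type: one inode per 4 MB of disk space
            echo "mkfs.ext4 -T largefile4 $DEV"
            echo "mount -t ext4 -o noatime $DEV $MNT"
            ;;
        *)
            echo "unknown filesystem: $1" >&2
            return 1
            ;;
    esac
}

print_setup xfs
print_setup ext4
```

Set DEV and MNT in the environment to match a real datanode disk, and check the printed commands carefully before running any of them by hand.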

So I'll be rolling our cluster back to EXT4, but thought the information 
might be useful/interesting to others.
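
To put the timings above in relative terms, here is a small sketch of the arithmetic. The helper name pct_diff is made up for illustration; the inputs are the minutes quoted in this message.

```shell
#!/bin/sh
# Percentage by which the first value exceeds the second, to one decimal place.
# pct_diff is a hypothetical helper name used only for this illustration.
pct_diff() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / b * 100 }'
}

# TeraSort: XFS took about 23 minutes, EXT4 about 18.5 minutes
terasort_slowdown=$(pct_diff 23 18.5)

# Bare-metal rebuild: EXT4 took 9 minutes, XFS 5.5 minutes
rebuild_extra=$(pct_diff 9 5.5)

echo "TeraSort on XFS took ${terasort_slowdown}% longer than on EXT4"
echo "Rebuild on EXT4 took ${rebuild_extra}% longer than on XFS"
```

On these figures the TeraSort run is roughly 24% slower on XFS, while the one-off rebuild is roughly 64% slower on EXT4.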


XFS config chosen from notes at 

Stephen Mulcahy, DI2, Digital Enterprise Research Institute,
NUI Galway, IDA Business Park, Lower Dangan, Galway, Ireland
http://di2.deri.ie    http://webstar.deri.ie    http://sindice.com
