hadoop-common-dev mailing list archives

From Martin Mituzas <xietao1...@hotmail.com>
Subject Write blocks and their metadata to different disk partitions?
Date Mon, 12 Jul 2010 03:02:20 GMT

Hi all,
I have an idea that is similar to the issue HDFS-325. I want to reduce the
impact of disk contention when reading HDFS files. From the HDFS code, I know
that reading a block also requires accessing its metadata file. I suspect
there is some contention when both files are read from the same disk, so I
plan to separate the two files.

My datanodes have four disks each, so I hacked the file FSDataset.java to
make the metadata for each block go to a different disk than the block
itself, e.g., the files blk_-7553934600807854967_1162.meta and
blk_-7553934600807854967_1162 are on /dev/disk1 and /dev/disk2 respectively.
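
To make the placement concrete, here is a minimal sketch of one way such a rule could look. It is not the actual FSDataset.java change: the class name, the method names, the /data/diskN directories, and the "previous volume, wrapping around" rule are all assumptions made for illustration. The file names follow the example above.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: NOT the real FSDataset code or any HDFS API.
class MetaPlacementSketch {

    // Assumed rule: metadata goes on the volume just before the block's
    // volume, wrapping around. With a single volume both files share a disk.
    static int metaVolumeIndex(int blockVolume, int volumeCount) {
        if (volumeCount <= 0 || blockVolume < 0 || blockVolume >= volumeCount) {
            throw new IllegalArgumentException(
                "bad volume index " + blockVolume + " of " + volumeCount);
        }
        return (blockVolume + volumeCount - 1) % volumeCount;
    }

    // Path of the block file on the volume chosen for the block.
    static File blockFile(List<File> volumes, int blockVolume, String blockName) {
        return new File(volumes.get(blockVolume), blockName);
    }

    // Path of the matching .meta file on the neighbouring volume.
    static File metaFile(List<File> volumes, int blockVolume, String blockName) {
        int metaVolume = metaVolumeIndex(blockVolume, volumes.size());
        return new File(volumes.get(metaVolume), blockName + ".meta");
    }

    public static void main(String[] args) {
        // Hypothetical data directories, one per physical disk.
        List<File> volumes = Arrays.asList(
            new File("/data/disk1"), new File("/data/disk2"),
            new File("/data/disk3"), new File("/data/disk4"));
        String name = "blk_-7553934600807854967_1162";
        System.out.println(blockFile(volumes, 1, name)); // block on the 2nd disk
        System.out.println(metaFile(volumes, 1, name));  // metadata on the 1st disk
    }
}
```

A fixed rotation like this keeps the location of the metadata computable from the location of the block, so a reader does not have to search every disk for the .meta file.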

With this change, I found that read performance in my test code improved by
~8%. (My test first writes files in parallel with many DFSClients, then
reads the files back.)

Of course, this performance improvement has a drawback: if one disk fails,
the data on the other three also becomes useless, because blocks and their
metadata no longer sit on the same disk. But I think this may be acceptable
if the whole cluster is huge...

Any comments on this? Thanks.
