hadoop-common-dev mailing list archives

From Milind Bhandarkar <mbhandar...@linkedin.com>
Subject Re: Hadoop use direct I/O in Linux?
Date Wed, 05 Jan 2011 22:03:33 GMT
I agree with Jay B. Checksumming is usually the culprit for high CPU usage on clients and datanodes.
Also, a 4-byte checksum for every 512 bytes means that a 64 MB block carries 512 KB of checksums,
i.e. 128 ext3 blocks (at 4 KB each). Changing this to generate one ext3 block of checksums per DFS
block would speed up reads and writes without any loss of reliability.
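
The arithmetic above can be checked with a short sketch. It assumes a 4 KB ext3 block size (implied by 512 KB = 128 blocks, but not stated outright) and ignores any header in the checksum file; the function and constant names are illustrative only, not Hadoop identifiers.

```python
# Sketch of the checksum-overhead arithmetic. Assumptions: 4 KB ext3 blocks,
# 4-byte checksums, and no header in the checksum file.
DFS_BLOCK = 64 * 1024 * 1024   # 64 MB DFS block
EXT3_BLOCK = 4 * 1024          # assumed ext3 block size
CHECKSUM_BYTES = 4             # one 4-byte checksum per chunk


def checksum_overhead(bytes_per_checksum):
    """Return (checksum bytes, ext3 blocks needed) for one DFS block."""
    chunks = DFS_BLOCK // bytes_per_checksum
    checksum_size = chunks * CHECKSUM_BYTES
    ext3_blocks = -(-checksum_size // EXT3_BLOCK)  # ceiling division
    return checksum_size, ext3_blocks


# Current layout: one checksum per 512 bytes of data.
current = checksum_overhead(512)

# Chunk size at which all checksums for a DFS block fit in one ext3 block.
chunk_for_one_block = DFS_BLOCK // (EXT3_BLOCK // CHECKSUM_BYTES)
proposed = checksum_overhead(chunk_for_one_block)

print(current)              # (524288, 128)
print(chunk_for_one_block)  # 65536
print(proposed)             # (4096, 1)
```

Under these assumptions, one ext3 block of checksums per DFS block corresponds to one checksum per 64 KB of data instead of one per 512 bytes.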

- milind

---
Milind Bhandarkar
(mbhandarkar@linkedin.com)
(650-776-3236)