hadoop-common-user mailing list archives

From Andy Isaacson <...@cloudera.com>
Subject Re: Optimizing Disk I/O - does HDFS do anything ?
Date Tue, 13 Nov 2012 21:53:33 GMT
On Tue, Nov 13, 2012 at 1:40 PM, Jay Vyas <jayunit100@gmail.com> wrote:
> 1) but I thought that this sort of thing (yes even on linux) becomes
> important when you have large amounts of data - because the way files are
> written can cause issues on highly packed drives.

If you're running any filesystem at 99% full with a workload that
creates or grows files, the filesystem will experience fragmentation.
Don't do that if you want good performance.

As long as there are a few dozen GB of free space to work with, ext4 on
a modern Linux kernel (2.6.38 or newer) will do a fine job of keeping
files sequential and shouldn't need defragmentation.
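The "don't run nearly full" advice is easy to automate. Here's a minimal sketch using Python's standard-library `shutil.disk_usage` to check how full a data directory is before it gets into fragmentation territory; the 5% threshold and the path are illustrative, not anything HDFS itself enforces:

```python
import shutil

def free_space_ratio(path="/"):
    """Return the fraction of the filesystem holding `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total

# Illustrative threshold: warn well before the filesystem is 99% full,
# since allocators start fragmenting new writes as free space runs out.
if free_space_ratio("/") < 0.05:
    print("filesystem nearly full; expect fragmentation on new writes")
```

You'd point this at each DataNode data directory rather than `/` in practice.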

To answer the original question -- HDFS doesn't take any special
measures to defragment its block files, but it does follow best
practices (large blocks, written sequentially) that avoid causing
fragmentation in the first place.
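Concretely, one reason HDFS blocks tend to stay sequential on disk is that each block is a single large file written once, front to back. The block size is controlled by the real `dfs.blocksize` property in `hdfs-site.xml`; the 128 MB value below is just an example setting, not a recommendation from this thread:

```xml
<!-- hdfs-site.xml: each HDFS block becomes one large file on the
     DataNode's local filesystem, written sequentially and never
     modified in place, which keeps ext4's allocator happy. -->
<property>
  <name>dfs.blocksize</name>
  <value>134217728</value> <!-- 128 MB, example value -->
</property>
```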

