hadoop-common-user mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: Interesting Hadoop/FUSE-DFS access patterns
Date Mon, 13 Apr 2009 23:01:43 GMT

On 4/12/09 9:41 PM, "Brian Bockelman" <bbockelm@cse.unl.edu> wrote:

> Ok, here's something perhaps even more strange.  I removed the "seek"
> part out of my timings, so I was only timing the "read" instead of the
> "seek + read" as in the first case.  I also turned the read-ahead down
> to 1-byte (aka, off).
> The jump *always* occurs at 128KB, exactly.

Some random ideas:

I have no idea how FUSE interops with the Linux block layer, but 128K
happens to be the default 'readahead' value for block devices, which may
just be a coincidence.

For a disk 'sda', you can check and set the value (in 512-byte sectors) with:

/sbin/blockdev --getra /dev/sda
/sbin/blockdev --setra [num blocks] /dev/sda
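
If you want to rule this in or out, one quick test (device name is just an example) would be to drop the readahead to zero, re-run your timings, and then put the old value back:

/sbin/blockdev --setra 0 /dev/sda     # disable block-device readahead
# ... re-run the read timings ...
/sbin/blockdev --setra 256 /dev/sda   # 256 sectors = 128KB, the usual default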

In my file system tests, the OS readahead does not kick in until a series of
sequential reads has gone through the block device, so truly random access is
not affected by it.  I've set it as high as 128MB and random iops did not
change on either an ext3 or an xfs file system.  If FUSE goes through the same
code path this should hold there too, but there may be reasons its behavior
differs.  Furthermore, even if readahead were kicking in, one would not expect
a random 4k read to be slower than a random read of up to the readahead size.
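
As a sanity check, a one-off random read against a plain file (outside
HDFS/FUSE) looks roughly like this -- the path and offset are just
placeholders, and iflag=direct bypasses the page cache so neither caching nor
readahead skews the timing:

time dd if=/path/to/testfile of=/dev/null bs=4k count=1 skip=123456 iflag=direct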

I also have no idea how much of the OS device queue and block device
scheduler is involved with FUSE.  If those are involved, then there's a
bunch of stuff to tinker with there as well.
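
For reference, the knobs I mean live under sysfs, e.g. for 'sda' (available
scheduler names vary by kernel; the switch needs root):

cat /sys/block/sda/queue/scheduler               # current scheduler shown in brackets
echo deadline > /sys/block/sda/queue/scheduler   # example: switch to the deadline scheduler
cat /sys/block/sda/queue/nr_requests             # depth of the request queue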

Lastly, an FYI in case you don't already know: if the OS is caching pages,
Linux has a way to flush them and evict the page cache.
See /proc/sys/vm/drop_caches .
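
Concretely, as root (the sync first writes out dirty pages so they can
actually be dropped):

sync
echo 3 > /proc/sys/vm/drop_caches   # 1 = pagecache, 2 = dentries/inodes, 3 = both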

> I'm a bit befuddled.  I know we say that HDFS is optimized for large,
> sequential reads, not random reads - but it seems that it's one bug-
> fix away from being a good general-purpose system.  Heck if I can find
> what's causing the issues though...
> Brian
