hadoop-common-user mailing list archives

From Ted Dunning <tdunn...@veoh.com>
Subject Re: Very weak mapred performance on small clusters with a massive amount of small files
Date Tue, 06 Nov 2007 18:58:26 GMT

I think that would help some, but the real obstacle to high performance is
erratic movement of the disk head (seeking).  If MultiFileInputFormat could
also order files by disk location and avoid successive file opens, you
might be OK, but that is asking for the moon.

On 11/6/07 8:14 AM, "Joydeep Sen Sarma" <jssarma@facebook.com> wrote:

> Would it help if the multifileinputformat bundled files into splits based on
> their location? (wondering if remote copy speed is a bottleneck in map)
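
A minimal sketch of the location-based bundling idea raised above, written in plain Python rather than against the Hadoop API. Everything here is an assumption for illustration: the function name `bundle_by_host`, the `(path, size_bytes, host)` input tuples, and the `max_split_bytes` cap are hypothetical and are not part of MultiFileInputFormat. It only groups small files that share a host into one split; it does not address disk-head ordering, since on-disk location is not visible at this level.

```python
from collections import defaultdict


def bundle_by_host(files, max_split_bytes):
    """Group small files into splits so each split reads from one host.

    files: iterable of (path, size_bytes, host) tuples (hypothetical input).
    Returns a list of (host, [paths]) splits.
    """
    # Collect the files stored on each host.
    by_host = defaultdict(list)
    for path, size, host in files:
        by_host[host].append((path, size))

    splits = []
    for host in sorted(by_host):
        current, current_bytes = [], 0
        # Sorting by path is only a stand-in for a stable order;
        # it says nothing about physical position on disk.
        for path, size in sorted(by_host[host]):
            # Close the current split once adding this file would exceed the cap.
            if current and current_bytes + size > max_split_bytes:
                splits.append((host, current))
                current, current_bytes = [], 0
            current.append(path)
            current_bytes += size
        if current:
            splits.append((host, current))
    return splits
```

Each resulting split could then be handed to a map task scheduled on the named host, which would avoid the remote copy but still leave one file open per small file.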
