hadoop-common-user mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: Faster alternative to FSDataInputStream
Date Wed, 19 Aug 2009 20:30:52 GMT

On 8/19/09 10:58 AM, "Raghu Angadi" <rangadi@yahoo-inc.com> wrote:

> Edward Capriolo wrote:
>>> On Wed, Aug 19, 2009 at 11:11 AM, Edward Capriolo
>>> <edlinuxguru@gmail.com> wrote:
>>>>>> It would be as fast as underlying filesystem goes.
>>>> I would not agree with that statement. There is overhead.
> You might be misinterpreting my comment. There is of course some
> overhead (at the least the procedure calls). Depending on your underlying
> filesystem, there could be extra buffer copies and CRC overhead. But
> none of that explains a transfer as slow as 1 MBps (if my interpretation
> of the results is correct).
> Raghu.

Yes, there is nothing about distributing work for parallel execution that is
going to make a single 20MB file transfer faster. That rate is very slow: the
transfer should take on the order of a second or so, not multiple minutes.
Something else is wrong.
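
One way to narrow this down is to time a plain sequential read of the same
file and compare the measured rate with what the transfer time implies. Below
is a rough sketch in Python, purely illustrative: the function names are made
up for this example, and it reads a local file through the normal OS path
rather than through HDFS or FSDataInputStream, so it only gives a baseline for
the underlying disk.

```python
import os
import tempfile
import time


def measure_read_throughput(path, buffer_size=64 * 1024):
    """Read `path` sequentially; return (bytes_read, seconds, mb_per_sec).

    Hypothetical helper for illustration only. It measures a local read,
    not an HDFS read.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(buffer_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    megabytes = total / (1024 * 1024)
    # Guard against a zero reading on very small files or coarse timers.
    rate = megabytes / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, rate


def expected_seconds(size_mb, rate_mb_per_sec):
    """Time a transfer of `size_mb` should take at a given sustained rate."""
    return size_mb / rate_mb_per_sec


if __name__ == "__main__":
    # Write a 20 MB scratch file, then time reading it back.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * (20 * 1024 * 1024))
        scratch = tmp.name
    try:
        size, secs, rate = measure_read_throughput(scratch)
        print(f"read {size} bytes in {secs:.3f} s ({rate:.1f} MB/s)")
        # At 1 MB/s a 20 MB file takes 20 s; at 20 MB/s it takes 1 s.
        print(f"20 MB at 1 MB/s:  {expected_seconds(20, 1):.0f} s")
        print(f"20 MB at 20 MB/s: {expected_seconds(20, 20):.0f} s")
    finally:
        os.remove(scratch)
```

The scratch file was just written, so it will probably be served from the OS
page cache and read far faster than the disk alone could manage. Treat that
number as an upper bound; the point is only that a local read and the HDFS
read should not differ by orders of magnitude.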
