hadoop-common-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Poor IO performance on a 10 node cluster.
Date Mon, 30 May 2011 17:32:00 GMT
Psst. The cats speak in their own language ;-)

On Mon, May 30, 2011 at 10:31 PM, James Seigel <james@tynt.com> wrote:
> Not sure that will help ;)
>
> Sent from my mobile. Please excuse the typos.
>
> On 2011-05-30, at 9:23 AM, Boris Aleksandrovsky <baleksan@gmail.com> wrote:
>
>> Ljddfjfjfififfifjftjiiiiiifjfjjjffkxbznzsjxodiewisshsudddudsjidhddueiweefiuftttoitfiirriifoiffkllddiririiriioerorooiieirrioeekroooeoooirjjfdijdkkduddjudiiehs
>> On May 30, 2011 5:28 AM, "Gyuribácsi" <bogyom74@gmail.com> wrote:
>>>
>>>
>>> Hi,
>>>
>>> I have a 10 node cluster (IBM blade servers, 48GB RAM, 2x500GB Disk, 16 HT
>>> cores).
>>>
>>> I've uploaded 10 files to HDFS. Each file is 10GB. I used the streaming jar
>>> with 'wc -l' as mapper and 'cat' as reducer.
>>>
>>> I use 64MB block size and the default replication (3).
>>>
>>> The wc on the 100 GB took about 220 seconds, which translates to about 3.5
>>> Gbit/sec processing speed. One disk can do sequential reads at 1 Gbit/sec, so
>>> I would expect something around 20 Gbit/sec (minus some overhead), and I'm
>>> getting only 3.5.
>>>
>>> Is my expectation valid?
>>>
>>> I checked the JobTracker and it seems all nodes are working, each reading
>>> the right blocks. I have not played with the number of mappers and reducers
>>> yet. It seems the number of mappers is the same as the number of blocks, and
>>> the number of reducers is 20 (there are 20 disks). This looks OK to me.
>>>
>>> We also did an experiment with TestDFSIO, with similar results. Aggregate
>>> read I/O speed is around 3.5 Gbit/sec. It is just too far from my
>>> expectation :(
>>>
>>> Please help!
>>>
>>> Thank you,
>>> Gyorgy
>>> --
>>> View this message in context:
>> http://old.nabble.com/Poor-IO-performance-on-a-10-node-cluster.-tp31732971p31732971.html
>>> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>>>
>



-- 
Harsh J

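For reference, the arithmetic in the original question can be checked with a short sketch. It only reproduces the figures quoted in the thread (10 files of 10 GB, 220 seconds, 20 disks at 1 Gbit/sec, 64 MB blocks). The variable names are illustrative, and it assumes "GB" means 10^9 bytes and "64MB" means 64 * 2^20 bytes, which the thread does not state.

```python
# Back-of-the-envelope check of the numbers quoted in the thread.
# Assumptions (not stated in the thread): 1 GB = 10**9 bytes and
# 64 MB = 64 * 2**20 bytes. All names here are illustrative.


def throughput_gbit_per_s(total_bytes: int, seconds: float) -> float:
    """Convert a byte count and a duration into gigabits per second."""
    return total_bytes * 8 / seconds / 1e9


files = 10
file_size_bytes = 10 * 10**9        # each file is 10 GB
job_seconds = 220                   # reported wall-clock time
disks = 10 * 2                      # 10 nodes with 2 disks each
disk_gbit_per_s = 1.0               # poster's sequential-read figure per disk
block_size_bytes = 64 * 2**20       # 64 MB HDFS block size

total_bytes = files * file_size_bytes
observed = throughput_gbit_per_s(total_bytes, job_seconds)
expected_ceiling = disks * disk_gbit_per_s

# One map task per block, as the poster observed; round up per file.
blocks_per_file = -(-file_size_bytes // block_size_bytes)
map_tasks = files * blocks_per_file

print(f"observed throughput: {observed:.2f} Gbit/s")
print(f"naive disk ceiling:  {expected_ceiling:.0f} Gbit/s")
print(f"fraction of ceiling: {observed / expected_ceiling:.0%}")
print(f"map tasks (blocks):  {map_tasks}")
```

Under these assumptions the observed rate comes out at roughly 3.6 Gbit/sec, in line with the "about 3.5" in the question and a little under a fifth of the 20 Gbit/sec the poster hoped for. The ceiling is the poster's own estimate, and it ignores task startup, scheduling, and shuffle overhead.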