hadoop-common-user mailing list archives

From Konstantin Shvachko <...@yahoo-inc.com>
Subject Re: TestDFSIO delivers bad values of "throughput" and "average IO rate"
Date Wed, 07 Jan 2009 19:46:01 GMT


tienduc_dinh wrote:
> Hi Konstantin,
> 
> thanks so much for your help. I was a little bit confused about why, despite
> my setting mapred.map.tasks = 10 in hadoop-site.xml, Hadoop didn't map
> accordingly. So your answer with
> 
>> In case of TestDFSIO it will be overridden by "-nrFiles".
> 
> is the key. 
> 
> I now need your confirmation that I've understood it right. 

That is correct.

> + If I want to write 2 GB with 1 map task, I should use the following
> command.
> 
>> hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 2048 -nrFiles 1
> 
> The values of throughput are, e.g. 33,60 / 31,48 / 30,95. 
> 
> + If I want to write 2 GB with 4 map tasks, I should use the following
> command.
> 
>> hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 5012 -nrFiles 4

You are writing 20 GB, not 2 GB: -fileSize is the size of each file in MB,
so it should be 512 instead of 5012.
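That is, to write 2 GB with 4 maps the command would be:

hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 512 -nrFiles 4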

> The values of throughput are, e.g. 31,50 / 32,09 / 30,56. 
> 
> Can you please explain why the values in case 2 aren't much better? I have
> 1 master and 4 slaves, and if I calculate it right, they should be about 4
> times higher, right?

Throughput is MB/sec per client.
It is great that you get the same numbers for 1 write and 4 parallel writes.
This means that Hadoop on your cluster scales well! :-)
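To put rough numbers on it (taking the reported values as MB/sec per map):
with 1 map you write at about 32 MB/sec, and with 4 maps each map still writes
at about 31 MB/sec, so the aggregate is roughly 4 * 31 = 124 MB/sec. The ~4x
gain you expected shows up in the aggregate, not in each map's individual rate.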

> Sorry for my poor English, and thanks very much for your help.
> 
> Tien Duc Dinh
> 
> 
> Konstantin Shvachko wrote:
>> Hi tienduc_dinh,
>>
>> Just a bit of background, which should help to answer your questions.
>> TestDFSIO mappers perform one operation (read or write) each, measure
>> the time taken by the operation and output the following three values:
>> (I am intentionally omitting some other output stuff.)
>> - size(i)
>> - time(i)
>> - rate(i) = size(i) / time(i)
>> where i is the index of the map task, 0 <= i < N, and N is the "-nrFiles"
>> value, which equals the number of maps.
>>
>> Then the reduce sums those values and writes them into "part-00000".
>> That is, you get three fields in it:
>> size = size(0) + ... + size(N-1)
>> time = time(0) + ... + time(N-1)
>> rate = rate(0) + ... + rate(N-1)
>>
>> Then we calculate
>> throughput = size / time
>> averageIORate = rate / N
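(A quick example with assumed round numbers: say N = 4 maps each write 512 MB
in 16 sec, so each rate(i) = 32 MB/sec. Then size = 2048, time = 64,
throughput = 2048 / 64 = 32 MB/sec, and averageIORate = (4 * 32) / 4 = 32 MB/sec.
Neither number grows with the number of maps; they stay at the per-map level.)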
>>
>> So, answering your questions:
>> - There should be only one reduce task; otherwise you will have to
>> manually sum the corresponding values from "part-00000" and "part-00001".
>> - The value of the ":rate" field after the reduce equals the sum of the
>> individual rates of each operation. So if you want an average you should
>> divide it by the number of tasks rather than multiply.
>>
>> Now, in your case you create only one file "-nrFiles 1", which means
>> you run only one map task.
>> Setting "mapred.map.tasks" to 10 in hadoop-site.xml defines the default
>> number of tasks per job. See here
>> http://hadoop.apache.org/core/docs/current/hadoop-default.html#mapred.map.tasks
>> In case of TestDFSIO it will be overridden by "-nrFiles".
>>
>> Hope this answers your questions.
>> Thanks,
>> --Konstantin
>>
>>
>>
>> tienduc_dinh wrote:
>>> Hello,
>>>
>>> I'm now using hadoop-0.18.0 and testing it on a cluster with 1 master and
>>> 4 slaves. In hadoop-site.xml the value of "mapred.map.tasks" is 10. Because
>>> the values of "throughput" and "average IO rate" are similar, I just post
>>> the "throughput" values from running the same command 3 times:
>>>
>>> - hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 2048 -nrFiles 1
>>>
>>> + with "dfs.replication = 1" => 33,60 / 31,48 / 30,95
>>>
>>> + with "dfs.replication = 2" => 26,40 / 20,99 / 21,70
>>>
>>> I find something strange while reading the source code. 
>>>
>>> - The value of mapred.reduce.tasks is always set to 1 
>>>
>>> job.setNumReduceTasks(1) in the function runIOTest(), and reduceFile =
>>> new Path(WRITE_DIR, "part-00000") in analyzeResult().
>>>
>>> So I think, if we actually had mapred.reduce.tasks = 2, we would get two
>>> output paths on the file system, "part-00000" and "part-00001", e.g.
>>> /benchmarks/TestDFSIO/io_write/part-00000.
>>>
>>> - And I don't understand the line "double med = rate / 1000 / tasks".
>>> Shouldn't it be "double med = rate * tasks / 1000"?
>>
> 
