hadoop-common-user mailing list archives

From tienduc_dinh <tienduc_d...@yahoo.com>
Subject Re: TestDFSIO delivers bad values of "throughput" and "average IO rate"
Date Wed, 07 Jan 2009 14:51:18 GMT

Hi Konstantin,

Thanks so much for your help. I was a little confused about why, although
I set mapred.map.tasks = 10 in hadoop-site.xml, Hadoop did not run that
many map tasks. So your answer with

> In case of TestDFSIO it will be overridden by "-nrFiles".

is the key. 

I now need your confirmation that I have understood it correctly.

+ If I want to write 2 GB with 1 map task, I should use the following
command.

> hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 2048 -nrFiles 1

The values of throughput are, e.g. 33,60 / 31,48 / 30,95. 

+ If I want to write 2 GB with 4 map tasks, I should use the following
command.

> hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 5012 -nrFiles 4

The values of throughput are, e.g. 31,50 / 32,09 / 30,56. 

Can you please explain why the values in case 2 are not much better? I have
1 master and 4 slaves, and if I calculate it right, they should be about
4 times higher, right?
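
One way to read these numbers, using the formulas from Konstantin's reply
quoted below: the reported "throughput" divides the summed size by the summed
task time, not by the wall-clock time of the job. As a simplifying assumption
(not something measured here), suppose each of the N map tasks writes S MB and
takes the same time T:

```latex
\text{throughput}
  = \frac{\sum_{i=0}^{N-1} \text{size}(i)}{\sum_{i=0}^{N-1} \text{time}(i)}
  = \frac{N \cdot S}{N \cdot T}
  = \frac{S}{T}
```

Under that assumption the reported value is a per-task figure and does not
grow with N, so similar numbers for 1 and 4 map tasks are what the formula
predicts. If the tasks really run concurrently, the aggregate rate of the
cluster would be closer to N times this value.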

Sorry for my poor English and thanks very much for your help.

Tien Duc Dinh


Konstantin Shvachko wrote:
> 
> Hi tienduc_dinh,
> 
> Just a bit of a background, which should help to answer your questions.
> TestDFSIO mappers perform one operation (read or write) each, measure
> the time taken by the operation and output the following three values:
> (I am intentionally omitting some other output stuff.)
> - size(i)
> - time(i)
> - rate(i) = size(i) / time(i)
> i is the index of the map task 0 <= i < N, and N is the "-nrFiles" value,
> which equals the number of maps.
> 
> Then the reduce sums those values and writes them into "part-00000".
> That is, you get three fields in it:
> size = size(0) + ... + size(N-1)
> time = time(0) + ... + time(N-1)
> rate = rate(0) + ... + rate(N-1)
> 
> Then we calculate
> throughput = size / time
> averageIORate = rate / N
> 
> So answering your questions
> - There should be only one reduce task, otherwise you will have to
> manually sum corresponding values in "part-00000" and "part-00001".
> - The value of the ":rate" after the reduce equals the sum of individual
> rates of each operation. So if you want to have an average you should
> divide it by the number of tasks rather than multiply.
> 
> Now, in your case you create only one file "-nrFiles 1", which means
> you run only one map task.
> Setting "mapred.map.tasks" to 10 in hadoop-site.xml defines the default
> number of tasks per job. See here
> http://hadoop.apache.org/core/docs/current/hadoop-default.html#mapred.map.tasks
> In case of TestDFSIO it will be overridden by "-nrFiles".
> 
> Hope this answers your questions.
> Thanks,
> --Konstantin
> 
> 
> 
> tienduc_dinh wrote:
>> Hello,
>> 
>> I'm now using hadoop-0.18.0 and testing it on a cluster with 1 master
>> and 4 slaves. In hadoop-site.xml the value of "mapred.map.tasks" is 10.
>> Because the values of "throughput" and "average IO rate" are similar,
>> I just post the values of "throughput" for 3 runs of the same command.
>> 
>> - > hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 2048 -nrFiles 1
>> 
>> + with "dfs.replication = 1" => 33,60 / 31,48 / 30,95
>> 
>> + with "dfs.replication = 2" => 26,40 / 20,99 / 21,70
>> 
>> I found something strange while reading the source code.
>> 
>> - The value of mapred.reduce.tasks is always set to 1 
>> 
>> job.setNumReduceTasks(1) in the function runIOTest() and reduceFile =
>> new Path(WRITE_DIR, "part-00000") in analyzeResult().
>> 
>> So I think that if we have mapred.reduce.tasks = 2, we will have 2 paths
>> on the file system, "part-00000" and "part-00001", e.g.
>> /benchmarks/TestDFSIO/io_write/part-00000
>> 
>> - And I don't understand the line with "double med = rate / 1000 / tasks".
>> Shouldn't it be "double med = rate * tasks / 1000"?
> 
> 
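
The aggregation Konstantin describes in the quoted reply (sum size, time and
rate over all map tasks, then derive throughput and averageIORate) can be
sketched as below. This is only an illustration of those formulas, not the
TestDFSIO source: the class name, method names, the MB and second units, and
the sample measurements are all assumptions made for the example.

```java
import java.util.Locale;

public class DfsioAggregationSketch {

    // throughput = (size(0) + ... + size(N-1)) / (time(0) + ... + time(N-1))
    public static double throughput(double[] sizes, double[] times) {
        double size = 0.0;
        double time = 0.0;
        for (int i = 0; i < sizes.length; i++) {
            size += sizes[i];
            time += times[i];
        }
        return size / time;
    }

    // averageIORate = (rate(0) + ... + rate(N-1)) / N, with rate(i) = size(i) / time(i)
    public static double averageIORate(double[] sizes, double[] times) {
        double rate = 0.0;
        for (int i = 0; i < sizes.length; i++) {
            rate += sizes[i] / times[i];
        }
        // Divide by the number of tasks to get a mean, as the reply explains.
        return rate / sizes.length;
    }

    public static void main(String[] args) {
        // Hypothetical measurements for N = 2 map tasks (assumed units: MB, seconds).
        double[] sizes = {100.0, 100.0};
        double[] times = {2.0, 4.0};
        System.out.printf(Locale.ROOT, "throughput    = %.2f MB/s%n", throughput(sizes, times));
        System.out.printf(Locale.ROOT, "averageIORate = %.2f MB/s%n", averageIORate(sizes, times));
    }
}
```

With these made-up inputs the throughput is 200 / 6, about 33.33 MB/s, while
the average IO rate is (50 + 25) / 2 = 37.50 MB/s. The two values coincide
only when every task runs at the same rate, which may be why they look so
similar in the runs reported above.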
