From: Konstantin Shvachko
Date: Wed, 14 Jan 2009 19:26:26 -0800
To: core-user@hadoop.apache.org
Subject: Re: TestDFSIO delivers bad values of "throughput" and "average IO rate"
Message-ID: <496EACE2.2090007@yahoo-inc.com>
References: <4963F415.9030508@yahoo-inc.com> <21399409.post@talk.nabble.com>
In-Reply-To: <21399409.post@talk.nabble.com>

In TestDFSIO we want each task to create only one file.
It is a one-to-one mapping from files to map tasks.
And splits are defined so that each map gets only one
file name, which it creates or reads.

--Konstantin

tienduc_dinh wrote:
> I don't understand why the parameter -nrFiles of TestDFSIO should override
> mapred.map.tasks.
> nrFiles is the number of files which will be created and
> mapred.map.tasks is the number of splits made from the input
> file.
>
> Thanks
>
>
> Konstantin Shvachko wrote:
>> Hi tienduc_dinh,
>>
>> Just a bit of background, which should help to answer your questions.
>> TestDFSIO mappers perform one operation (read or write) each, measure
>> the time taken by the operation and output the following three values:
>> (I am intentionally omitting some other output stuff.)
>> - size(i)
>> - time(i)
>> - rate(i) = size(i) / time(i)
>> i is the index of the map task 0 <= i < N, and N is the "-nrFiles" value,
>> which equals the number of maps.
>>
>> Then the reduce sums those values and writes them into "part-00000".
>> That is, you get three fields in it:
>> size = size(0) + ... + size(N-1)
>> time = time(0) + ... + time(N-1)
>> rate = rate(0) + ... + rate(N-1)
>>
>> Then we calculate
>> throughput = size / time
>> averageIORate = rate / N
>>
>> So answering your questions
>> - There should be only one reduce task, otherwise you will have to
>> manually sum corresponding values in "part-00000" and "part-00001".
>> - The value of the ":rate" after the reduce equals the sum of individual
>> rates of each operation. So if you want to have an average you should
>> divide it by the number of tasks rather than multiply.
>>
>> Now, in your case you create only one file "-nrFiles 1", which means
>> you run only one map task.
>> Setting "mapred.map.tasks" to 10 in hadoop-site.xml defines the default
>> number of tasks per job. See here
>> http://hadoop.apache.org/core/docs/current/hadoop-default.html#mapred.map.tasks
>> In case of TestDFSIO it will be overridden by "-nrFiles".
>>
>> Hope this answers your questions.
>> Thanks,
>> --Konstantin
>>
>>
>>
>> tienduc_dinh wrote:
>>> Hello,
>>>
>>> I'm now using hadoop-0.18.0 and testing it on a cluster with 1 master and 4
>>> slaves. In hadoop-site.xml the value of "mapred.map.tasks" is 10. Because
>>> the values "throughput" and "average IO rate" are similar, I just post the
>>> values of "throughput" of the same command with 3 times running
>>>
>>> - > hadoop-0.18.0/bin/hadoop jar testDFSIO.jar -write -fileSize 2048 -nrFiles 1
>>>
>>> + with "dfs.replication = 1" => 33,60 / 31,48 / 30,95
>>>
>>> + with "dfs.replication = 2" => 26,40 / 20,99 / 21,70
>>>
>>> I find something strange while reading the source code.
>>>
>>> - The value of mapred.reduce.tasks is always set to 1
>>>
>>> job.setNumReduceTasks(1) in the function runIOTest() and reduceFile = new
>>> Path(WRITE_DIR, "part-00000") in analyzeResult().
>>>
>>> So I think, if we properly have mapred.reduce.tasks = 2, we will have on the
>>> file system 2 Paths to "part-00000" and "part-00001", e.g.
>>> /benchmarks/TestDFSIO/io_write/part-00000
>>>
>>> - And i don't understand the line with "double med = rate / 1000 / tasks".
>>> Is it not "double med = rate * tasks / 1000 "
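
Editor's note: the aggregation described in the thread above (sum size, time and rate over the N map tasks, then compute throughput = size / time and averageIORate = rate / N) can be illustrated with a minimal sketch. This is not the TestDFSIO source. The class name DfsioSummary, the method names and the sample numbers are hypothetical, and the sketch ignores whatever unit scaling the real code applies (the thread quotes a "/ 1000" factor).

```java
// Illustrative sketch only; names and sample values are hypothetical,
// and units are simplified (size in MB, time in seconds).
class DfsioSummary {

    /** throughput = (size(0) + ... + size(N-1)) / (time(0) + ... + time(N-1)) */
    public static double throughput(double[] sizes, double[] times) {
        double size = 0.0;
        double time = 0.0;
        for (int i = 0; i < sizes.length; i++) {
            size += sizes[i];
            time += times[i];
        }
        return size / time;
    }

    /** averageIORate = (rate(0) + ... + rate(N-1)) / N, with rate(i) = size(i) / time(i) */
    public static double averageIoRate(double[] sizes, double[] times) {
        double rate = 0.0;
        for (int i = 0; i < sizes.length; i++) {
            rate += sizes[i] / times[i];
        }
        // The reduce output holds the SUM of rates, so divide by N to get the mean.
        return rate / sizes.length;
    }

    public static void main(String[] args) {
        // Two hypothetical map tasks: one fast, one slow.
        double[] sizes = {100.0, 100.0};
        double[] times = {10.0, 40.0};
        System.out.println("throughput    = " + throughput(sizes, times));
        System.out.println("averageIORate = " + averageIoRate(sizes, times));
    }
}
```

With these sample values the per-task rates are 10 and 2.5, so the summed rate is 12.5 and the average is 6.25, while the throughput is 200 / 50 = 4. The two metrics differ whenever tasks run at different speeds, and multiplying the summed rate by the number of tasks (25 here) would not be an average of anything.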