hadoop-mapreduce-user mailing list archives

From "Wantao" <liu_wan...@qq.com>
Subject Re: FW: NNbench and MRBench
Date Tue, 10 May 2011 14:26:01 GMT
Hi Stanley,

Evaluating and analysing the benchmark results of a system is complicated; it requires a deep understanding of the target system. You can probably get some hints from papers about MapReduce/Hadoop, whose experiment sections typically provide good examples of how to evaluate test results. Benchmarking work for other systems (e.g., file systems) may be helpful for you as well.

Wantao
 
 
------------------ Original ------------------
From:  "Marcos Ortiz"<mlortiz@uci.cu>;
Date:  Sun, May 8, 2011 02:14 PM
To:  "stanley.shi"<stanley.shi@emc.com>; 
Cc:  "mapreduce-user"<mapreduce-user@hadoop.apache.org>; 
Subject:  Re: FW: NNbench and MRBench

 
On 5/8/2011 12:46 AM, stanley.shi@emc.com wrote:

Thanks Marcos. This post of Michael Noll does provide some information about how to run these benchmarks, but there's not much information about how to evaluate the results. Do you know of any resources about result analysis? Thanks very much :)

Regards,
Stanley

-----Original Message-----
From: Marcos Ortiz [mailto:mlortiz@uci.cu]
Sent: May 8, 2011 11:09
To: mapreduce-user@hadoop.apache.org
Cc: Shi, Stanley
Subject: Re: FW: NNbench and MRBench

On 5/7/2011 10:33 PM, stanley.shi@emc.com wrote:

Thanks, Marcos. Through these links, I still can't find anything about NNbench and MRBench.

-----Original Message-----
From: Marcos Ortiz [mailto:mlortiz@uci.cu]
Sent: May 8, 2011 10:23
To: mapreduce-user@hadoop.apache.org
Cc: Shi, Stanley
Subject: Re: FW: NNbench and MRBench

On 5/7/2011 8:53 PM, stanley.shi@emc.com wrote:
Hi guys,

I have a cluster of 16 machines running Hadoop. Now I want to do some benchmarking on this cluster with "nnbench" and "mrbench". I'm new to Hadoop and have no one to refer to, so I don't know what results to expect. For mrbench, I get an average time of 22 sec for a one-map job. Is that too bad? What results might be expected? And for nnbench, what are the expected results? Below is my result:

================
               Date & time: 2011-05-05 20:40:25,459
            Test Operation: rename
                Start time: 2011-05-05 20:40:03,820
               Maps to run: 1
            Reduces to run: 1
        Block Size (bytes): 1
            Bytes to write: 0
        Bytes per checksum: 1
           Number of files: 10000
        Replication factor: 1
Successful file operations: 10000
# maps that missed the barrier: 0
              # exceptions: 0
               TPS: Rename: 1763
Avg Exec time (ms): Rename: 0.5672
     Avg Lat (ms): Rename: 0.4844
    RAW DATA: AL Total #1: 4844
    RAW DATA: AL Total #2: 0
 RAW DATA: TPS Total (ms): 5672
RAW DATA: Longest Map Time (ms): 5672.0
      RAW DATA: Late maps: 0
 RAW DATA: # of exceptions: 0
=============================

One more question: when I set the number of maps higher, I get all-zero results:

=============================
            Test Operation: create_write
                Start time: 2011-05-03 23:22:39,239
               Maps to run: 160
            Reduces to run: 160
        Block Size (bytes): 1
            Bytes to write: 0
        Bytes per checksum: 1
           Number of files: 1
        Replication factor: 1
Successful file operations: 0
# maps that missed the barrier: 0
              # exceptions: 0
   TPS: Create/Write/Close: 0
Avg exec time (ms): Create/Write/Close: 0.0
Avg Lat (ms): Create/Write: NaN
       Avg Lat (ms): Close: NaN
    RAW DATA: AL Total #1: 0
    RAW DATA: AL Total #2: 0
 RAW DATA: TPS Total (ms): 0
RAW DATA: Longest Map Time (ms): 0.0
      RAW DATA: Late maps: 0
 RAW DATA: # of exceptions: 0
=====================
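As a side note on reading numbers like these: the reported TPS can be sanity-checked against the raw totals. A minimal sketch, using the figures from the rename run above (assuming TPS is simply operations divided by total map time):

```python
# Sanity-check the rename run's reported TPS against its raw totals.
# Both figures are copied from the nnbench output above.
ops = 10000            # Successful file operations
tps_total_ms = 5672    # RAW DATA: TPS Total (ms)

tps = ops / (tps_total_ms / 1000.0)   # operations per second
print(round(tps))                     # → 1763, matching "TPS: Rename: 1763"
```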
Can anyone point me to some documents? I really appreciate your help :)

Thanks,
Stanley

You can use these resources:
http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/
http://answers.oreilly.com/topic/460-how-to-benchmark-a-hadoop-cluster/
http://wiki.apache.org/hadoop/HardwareBenchmarks
http://www.quora.com/Apache-Hadoop/Are-there-any-good-Hadoop-benchmark-problems

Regards
Well, Michael Noll's post says this:

NameNode benchmark (nnbench)
=======================
NNBench (see src/test/org/apache/hadoop/hdfs/NNBench.java) is useful for load testing the NameNode hardware and configuration. It generates a lot of HDFS-related requests with normally very small "payloads" for the sole purpose of putting a high HDFS management stress on the NameNode. The benchmark can simulate requests for creating, reading, renaming and deleting files on HDFS.

I like to run this test simultaneously from several machines -- e.g. from a set of DataNode boxes -- in order to hit the NameNode from multiple locations at the same time.

The syntax of NNBench is as follows:

NameNode Benchmark 0.4
Usage: nnbench <options>
Options:
    -operation <Available operations are create_write open_read rename delete. This option is mandatory>
      * NOTE: The open_read, rename and delete operations assume that the files they operate on are already available. The create_write operation must be run before running the other operations.
    -maps <number of maps. default is 1. This is not mandatory>
    -reduces <number of reduces. default is 1. This is not mandatory>
    -startTime <time to start, given in seconds from the epoch. Make sure this is far enough into the future, so all maps (operations) will start at the same time. default is launch time + 2 mins. This is not mandatory>
    -blockSize <Block size in bytes. default is 1. This is not mandatory>
    -bytesToWrite <Bytes to write. default is 0. This is not mandatory>
    -bytesPerChecksum <Bytes per checksum for the files. default is 1. This is not mandatory>
    -numberOfFiles <number of files to create. default is 1. This is not mandatory>
    -replicationFactorPerFile <Replication factor for the files. default is 1. This is not mandatory>
    -baseDir <base DFS path. default is /benchmarks/NNBench. This is not mandatory>
    -readFileAfterOpen <true or false. if true, it reads the file and reports the average time to read. This is valid with the open_read operation. default is false. This is not mandatory>
    -help: Display the help statement

The following command will run a NameNode benchmark that creates 1000 files using 12 maps and 6 reducers. It uses a custom output directory based on the machine's short hostname. This is a simple trick to ensure that one box does not accidentally write into the same output directory as another box running NNBench at the same time.

$ hadoop jar hadoop-*-test.jar nnbench -operation create_write \
    -maps 12 -reduces 6 -blockSize 1 -bytesToWrite 0 -numberOfFiles 1000 \
    -replicationFactorPerFile 3 -readFileAfterOpen true \
    -baseDir /benchmarks/NNBench-`hostname -s`

Note that by default the benchmark waits 2 minutes before it actually starts!

MapReduce benchmark (mrbench)
=======================
MRBench (see src/test/org/apache/hadoop/mapred/MRBench.java) loops a small job a number of times. As such it is a very complementary benchmark to the "large-scale" TeraSort benchmark suite, because MRBench checks whether small job runs are responsive and running efficiently on your cluster. It puts its focus on the MapReduce layer, as its impact on the HDFS layer is very limited.

This test should be run from a single box (see caveat below). The command syntax can be displayed via mrbench --help:

MRBenchmark.0.0.2
Usage: mrbench [-baseDir ]
               [-jar ]
               [-numRuns ]
               [-maps ]
               [-reduces ]
               [-inputLines ]
               [-inputType ]
               [-verbose]

Important note: In Hadoop 0.20.2, setting the -baseDir parameter has no effect. This means that multiple parallel MRBench runs (e.g. started from different boxes) might interfere with each other. This is a known bug (MAPREDUCE-2398). I have submitted a patch but it has not been integrated yet.

In Hadoop 0.20.2, the parameters default to:
-baseDir: /benchmarks/MRBench  [*** see my note above ***]
-numRuns: 1
-maps: 2
-reduces: 1
-inputLines: 1
-inputType: ascending

The command to run a loop of 50 small test jobs is:

$ hadoop jar hadoop-*-test.jar mrbench -numRuns 50

Exemplary output of the above command:

DataLines       Maps    Reduces AvgTime (milliseconds)
1               2       1       31414

This means that the average finish time of executed jobs was 31 seconds.

Can you check these?
http://www.slideshare.net/ydn/ahis2011-platform-hadoop-simulation-and-performance
http://issues.apache.org/jira/browse/HADOOP-5867

Did you search the current documentation of the API?

Regards

Ok, I understand.
Let me try to help you, though I'm still a newbie in the Hadoop ecosystem.

Tom White, in his answer to this topic on the O'Reilly Answers site, gives an introduction to this:
 
  
The following command writes 10 files of 1,000 MB each:
 % hadoop jar $HADOOP_INSTALL/hadoop-*-test.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
 
At the end of the run, the results are written to the console and also recorded in a local
file (which is appended to, so you can rerun the benchmark and not lose old results):
% cat TestDFSIO_results.log
----- TestDFSIO ----- : write
           Date & time: Sun Apr 12 07:14:09 EDT 2009
       Number of files: 10
Total MBytes processed: 10000
     Throughput mb/sec: 7.796340865378244
Average IO rate mb/sec: 7.8862199783325195
 IO rate std deviation: 0.9101254683525547
    Test exec time sec: 163.387
The files are written under the /benchmarks/TestDFSIO directory by default (this can be changed
by setting the test.build.data system property), in a directory called io_data.
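Because the log is appended to, results from many runs accumulate in one file; a small script can pull the throughput figures back out for comparison across runs. A minimal sketch (the helper below is hypothetical, not part of Hadoop; the field names match the log excerpt above):

```python
import re

def parse_testdfsio_log(text):
    """Extract (test type, throughput mb/sec) pairs from TestDFSIO log text."""
    results = []
    current = None
    for line in text.splitlines():
        m = re.search(r"TestDFSIO\s*-----\s*:\s*(\w+)", line)
        if m:
            current = m.group(1)          # e.g. "write" or "read"
        m = re.search(r"Throughput mb/sec:\s*([\d.]+)", line)
        if m and current:
            results.append((current, float(m.group(1))))
    return results

# Sample taken from the write run shown above.
sample = """----- TestDFSIO ----- : write
           Date & time: Sun Apr 12 07:14:09 EDT 2009
       Number of files: 10
Total MBytes processed: 10000
     Throughput mb/sec: 7.796340865378244
"""
print(parse_testdfsio_log(sample))   # → [('write', 7.796340865378244)]
```

In practice you would read the real log with `open("TestDFSIO_results.log").read()` and plot or tabulate the pairs.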
 
To run a read benchmark, use the -read argument. Note that these files must already exist
(having been written by TestDFSIO -write):
 % hadoop jar $HADOOP_INSTALL/hadoop-*-test.jar TestDFSIO -read -nrFiles 10  -fileSize 1000
 
Here are the results for a real run:
----- TestDFSIO ----- : read
           Date & time: Sun Apr 12 07:24:28 EDT 2009
       Number of files: 10
Total MBytes processed: 10000
     Throughput mb/sec: 80.25553361904304
Average IO rate mb/sec: 98.6801528930664
 IO rate std deviation: 36.63507598174921
-----------------------------------------
    Test exec time sec: 47.624
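One simple way to read the two runs together is to compare read throughput against write throughput (figures copied from the example logs above):

```python
# Ratio of read to write throughput for the example TestDFSIO runs above.
write_tput = 7.796340865378244   # mb/sec, from the write run
read_tput = 80.25553361904304    # mb/sec, from the read run
print(round(read_tput / write_tput, 1))   # → 10.3
```

The much larger IO rate standard deviation on the read side (36.6 vs 0.91) also suggests uneven performance across files, which would be worth a closer look on a real cluster.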
When you’ve finished benchmarking, you can delete all the generated files from HDFS using
the -clean argument:
 % hadoop jar $HADOOP_INSTALL/hadoop-*-test.jar TestDFSIO -clean 
 You can see that all results are written to the TestDFSIO_results.log.
 
 So, you can begin to experiment with this.
You can continue reading in Chapter 9 of Hadoop: The Definitive Guide, 2nd Edition, under the topic "Benchmarking a Hadoop Cluster".
 
In it, Tom gives several pieces of advice for benchmarking a Hadoop cluster:
 
 - Use a cluster that is not being used by others.
 - One of the primary tests you should run is an intensive I/O benchmark, to prove the cluster before it goes live in production.
 - Write benchmarks with Gridmix (check this: http://developer.yahoo.net/blogs/hadoop/2010/04/gridmix3_emulating_production.html)
 
 
Well, I hope this information helps you. Remember, I've only worked with Hadoop for one year, so you may want to ask other colleagues for advice too.
 
 Regards --  Marcos Luís Ortíz Valmaseda  Software Engineer (Large-Scaled Distributed Systems)
 University of Information Sciences,  La Habana, Cuba  Linux User # 418229  http://about.me/marcosortiz