hadoop-common-user mailing list archives

From Adam Kawa <kawa.a...@gmail.com>
Subject Re: Hadoop Pi Example in Yarn
Date Wed, 18 Dec 2013 21:15:46 GMT
A map task is created for each input split in your dataset. By default, an
input split corresponds to a block in HDFS, i.e. if a file consists of 1 HDFS
block, then 1 map task will be started; if a file consists of N blocks,
then N map tasks will be started for that file (assuming default
settings).
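
As a rough sketch of that rule (a simplification for illustration, not code from Hadoop): the class and method names below are hypothetical, and the model assumes exactly one split per block. Real input formats may treat a small trailing remainder differently, and empty files are deliberately left out.

```java
public class SplitCount {
    // Simplified model: one input split (and so one map task) per HDFS block,
    // which gives ceil(fileSizeBytes / blockSizeBytes) for a non-empty file.
    static long mapTasksForFile(long fileSizeBytes, long blockSizeBytes) {
        if (fileSizeBytes <= 0 || blockSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // Integer ceiling division without floating point.
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // The figures from the question in this thread: 100 MB input, 20 MB splits.
        System.out.println(mapTasksForFile(100 * mb, 20 * mb)); // prints 5
        // A file smaller than one block still gets one map task.
        System.out.println(mapTasksForFile(1 * mb, 64 * mb));   // prints 1
    }
}
```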

PiEstimator generates its own input files. When you submit a PiEstimator
job, you specify how many map tasks you want to run. Then, before
submitting the job to the cluster, it generates that number of input
files in HDFS. One map task is started for each file. Interestingly,
each file contains only a single record (one key/value pair).

You can see the relevant code here:
http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/org.apache.hadoop/hadoop-examples/0.20.2-cdh3u1/org/apache/hadoop/examples/PiEstimator.java#PiEstimator

      //generate an input file for each map task
      for(int i=0; i < numMaps; ++i) {
        final Path file = new Path(inDir, "part"+i);
        final LongWritable offset = new LongWritable(i * numPoints);
        final LongWritable size = new LongWritable(numPoints);
        final SequenceFile.Writer writer = SequenceFile.createWriter(
            fs, jobConf, file,
            LongWritable.class, LongWritable.class, CompressionType.NONE);
        try {
          writer.append(offset, size);
        } finally {
          writer.close();
        }
        System.out.println("Wrote input for Map #"+i);
      }
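
To make the effect of that loop concrete without a cluster, here is a minimal plain-Java sketch that only models which files would be written and what single (offset, size) record each one would hold. The class `PiInputLayout`, the `InputFile` type and the `plan` method are hypothetical names invented for this illustration; nothing here touches HDFS or the Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;

public class PiInputLayout {
    /** One planned input file: its name plus the single (offset, size) record it holds. */
    static final class InputFile {
        final String name;
        final long offset;
        final long size;

        InputFile(String name, long offset, long size) {
            this.name = name;
            this.offset = offset;
            this.size = size;
        }
    }

    // Mirrors the loop above: file "part<i>" gets offset i * numPoints and size numPoints.
    static List<InputFile> plan(int numMaps, long numPoints) {
        List<InputFile> files = new ArrayList<>();
        for (int i = 0; i < numMaps; ++i) {
            files.add(new InputFile("part" + i, i * numPoints, numPoints));
        }
        return files;
    }

    public static void main(String[] args) {
        // Example: 3 map tasks with 1000 points each.
        for (InputFile f : plan(3, 1000)) {
            System.out.println(f.name + ": offset=" + f.offset + ", size=" + f.size);
        }
        // prints:
        // part0: offset=0, size=1000
        // part1: offset=1000, size=1000
        // part2: offset=2000, size=1000
    }
}
```

Since each tiny file fits in one block, each file yields one split, so the number of map tasks equals the number you asked for.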



2013/12/18 - <commodore65@ymail.com>

> How can the Pi example determine the number of mappers?
> I thought the only way to determine the number of mappers is via the number of
> file splits you have in the input file...
> So for instance, if the input size is 100MB and the file split size is 20MB, then
> I would expect to have 100/20 = 5 map tasks.
>
> Thanks
>
>
