hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ray Chiang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6321) Map tasks take a lot of time to start up
Date Fri, 01 May 2015 23:11:07 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14524254#comment-14524254
] 

Ray Chiang commented on MAPREDUCE-6321:
---------------------------------------

Task startup time includes scheduling determination delays, which is what YARN-2990 fixes.
 Localization and JVM startup are usually a noticeable chunk of the remaining time.

> Map tasks take a lot of time to start up
> ----------------------------------------
>
>                 Key: MAPREDUCE-6321
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6321
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 2.6.0
>            Reporter: Rajat Jain
>            Priority: Critical
>              Labels: performance
>
> I have noticed repeatedly that the map tasks take a lot of time to startup on YARN clusters.
This is not the scheduling part, this is after the actual container is launched containing
the Map task. Take for example, the sample log from a mapper of a Pi job that I launched.
The command I used to launch the Pi job was:
> {code}
> hadoop jar /usr/lib/hadoop/share/hadoop/mapreduce/hadoop*mapreduce*examples*jar pi 10
100
> {code}
> This is the sample job from one of the mappers which took 14 seconds to complete. If
you notice from the logs, most of the time taken by this job is during the start up. I notice
that the most mappers take anywhere between 7 to 15 seconds during start up and have seen
this behavior consistent across mapreduce jobs. This really affects the performance of short
running mappers.
> I run a hadoop2 / yarn cluster on a 4-5 node m1.xlarge cluster, and the mapper memory
is always specified as 2048m and so on.
> Log:
> {code}
> 2015-04-18 06:48:34,081 INFO [main] org.apache.hadoop.metrics2.impl.MetricsConfig: loaded
properties from hadoop-metrics2.properties
> 2015-04-18 06:48:34,637 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
Scheduled snapshot period at 10 second(s).
> 2015-04-18 06:48:34,637 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
MapTask metrics system started
> 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with
tokens:
> 2015-04-18 06:48:34,690 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job,
Service: job_1429338752209_0059, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@5d48e5d6)
> 2015-04-18 06:48:35,391 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for
0ms before retrying again. Got null now.
> 2015-04-18 06:48:36,656 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir
for child: /media/ephemeral3/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral1/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059,/media/ephemeral0/yarn/local/usercache/rjain/appcache/application_1429338752209_0059
> 2015-04-18 06:48:36,706 INFO [main] org.apache.hadoop.conf.Configuration.deprecation:
fs.default.name is deprecated. Instead, use fs.defaultFS
> 2015-04-18 06:48:37,387 INFO [main] org.apache.hadoop.conf.Configuration.deprecation:
fs.default.name is deprecated. Instead, use fs.defaultFS
> 2015-04-18 06:48:39,388 INFO [main] org.apache.hadoop.conf.Configuration.deprecation:
session.id is deprecated. Instead, use dfs.metrics.session-id
> 2015-04-18 06:48:39,448 INFO [main] org.apache.hadoop.conf.Configuration.deprecation:
fs.default.name is deprecated. Instead, use fs.defaultFS
> 2015-04-18 06:48:41,060 INFO [main] org.apache.hadoop.fs.s3native.NativeS3FileSystem:
setting Progress to org.apache.hadoop.mapred.Task$TaskReporter@601211d0 comment setting up
progress from Task
> 2015-04-18 06:48:41,098 INFO [main] org.apache.hadoop.mapred.Task:  Using ResourceCalculatorProcessTree
: [ ]
> 2015-04-18 06:48:41,585 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split:
hdfs://ec2-54-211-109-245.compute-1.amazonaws.com:9000/user/rjain/QuasiMonteCarlo_1429339685772_504558444/in/part4:0+118
> 2015-04-18 06:48:43,926 INFO [main] org.apache.hadoop.mapred.MapTask: (EQUATOR) 0 kvi
234881020(939524080)
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: mapreduce.task.io.sort.mb:
896
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: soft limit at 657666880
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufvoid
= 939524096
> 2015-04-18 06:48:43,927 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 234881020;
length = 58720256
> 2015-04-18 06:48:43,946 INFO [main] org.apache.hadoop.mapred.MapTask: Map output collector
class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush
of map output
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling map output
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend
= 18; bufvoid = 939524096
> 2015-04-18 06:48:44,022 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 234881020(939524080);
kvend = 234881016(939524064); length = 5/58720256
> 2015-04-18 06:48:44,065 INFO [main] org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext$DirSelector:
Returning directory: /media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/attempt_1429338752209_0059_m_000004_0_spill_0.out
> 2015-04-18 06:48:44,089 INFO [main] org.apache.hadoop.io.compress.CodecPool: Got brand-new
compressor [.snappy]
> 2015-04-18 06:48:44,100 INFO [main] org.apache.hadoop.mapred.MapTask: Finished spill
0
> 2015-04-18 06:48:44,111 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1429338752209_0059_m_000004_0
is done. And is in the process of committing
> 2015-04-18 06:48:44,164 INFO [main] org.apache.hadoop.fs.s3native.NativeS3FileSystem:
setting Progress to null comment clearing progress now that task is done
> 2015-04-18 06:48:44,191 INFO [main] org.apache.hadoop.mapred.Task: Counters for attempt_1429338752209_0059_m_000004_0
> 2015-04-18 06:48:44,203 INFO [main] org.apache.hadoop.mapred.Task: File System Counters
> 2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of bytes
read = 0
> 2015-04-18 06:48:44,206 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of bytes
written = 116166
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of read
operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of large
read operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of write
operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of bytes
read = 297
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of bytes
written = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of read
operations = 4
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of large
read operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of write
operations = 0
> 2015-04-18 06:48:44,207 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce Framework
> 2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map input records
= 1
> 2015-04-18 06:48:44,208 INFO [main] org.apache.hadoop.mapred.Task: Map output records
= 2
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output bytes =
18
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Map output materialized
bytes = 34
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Input split bytes
= 179
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Combine input records
= 0
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Spilled Records =
2
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Failed Shuffles =
0
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: Merged Map outputs
= 0
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: GC time elapsed (ms)
= 158
> 2015-04-18 06:48:44,209 INFO [main] org.apache.hadoop.mapred.Task: CPU time spent (ms)
= 2080
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Physical memory (bytes)
snapshot = 1104715776
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Virtual memory (bytes)
snapshot = 2900013056
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Total committed heap
usage (bytes) = 1251475456
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File Input Format
Counters 
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: Bytes Read = 118
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: File System Counters
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of bytes
read = 0
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of bytes
written = 116166
> 2015-04-18 06:48:44,210 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of read
operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of large
read operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: FILE: Number of write
operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of bytes
read = 297
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of bytes
written = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of read
operations = 4
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of large
read operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: HDFS: Number of write
operations = 0
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map-Reduce Framework
> 2015-04-18 06:48:44,211 INFO [main] org.apache.hadoop.mapred.Task: Map input records
= 1
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output records
= 2
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output bytes =
18
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Map output materialized
bytes = 34
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Input split bytes
= 179
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Combine input records
= 0
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Spilled Records =
2
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Failed Shuffles =
0
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: Merged Map outputs
= 0
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: GC time elapsed (ms)
= 158
> 2015-04-18 06:48:44,212 INFO [main] org.apache.hadoop.mapred.Task: CPU time spent (ms)
= 2080
> 2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Physical memory (bytes)
snapshot = 1104715776
> 2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Virtual memory (bytes)
snapshot = 2900013056
> 2015-04-18 06:48:44,213 INFO [main] org.apache.hadoop.mapred.Task: Total committed heap
usage (bytes) = 1251475456
> 2015-04-18 06:48:44,214 INFO [main] org.apache.hadoop.mapred.Task: File path: /media/ephemeral2/yarn/local/usercache/rjain/appcache/application_1429338752209_0059/output/attempt_1429338752209_0059_m_000004_0/file.out
> 2015-04-18 06:48:44,223 INFO [main] org.apache.hadoop.mapred.Task: Task 'attempt_1429338752209_0059_m_000004_0'
done.
> 2015-04-18 06:48:44,225 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
Stopping MapTask metrics system...
> 2015-04-18 06:48:44,231 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
MapTask metrics system stopped.
> 2015-04-18 06:48:44,233 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
MapTask metrics system shutdown complete.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message