hadoop-mapreduce-user mailing list archives

From Jian Fang <jian.fang.subscr...@gmail.com>
Subject Re: Why my tests shows Yarn is worse than MRv1 for terasort?
Date Wed, 23 Oct 2013 16:20:02 GMT
Here is one run for MR1. Thanks.

2013-10-22 08:59:43,799 INFO org.apache.hadoop.mapred.JobClient (main):
Counters: 31
2013-10-22 08:59:43,800 INFO org.apache.hadoop.mapred.JobClient (main):
Job Counters
2013-10-22 08:59:43,800 INFO org.apache.hadoop.mapred.JobClient (main):
Launched reduce tasks=127
2013-10-22 08:59:43,800 INFO org.apache.hadoop.mapred.JobClient (main):
SLOTS_MILLIS_MAPS=173095295
2013-10-22 08:59:43,800 INFO org.apache.hadoop.mapred.JobClient (main):
Total time spent by all reduces waiting after reserving slots (ms)=0
2013-10-22 08:59:43,800 INFO org.apache.hadoop.mapred.JobClient (main):
Total time spent by all maps waiting after reserving slots (ms)=0
2013-10-22 08:59:43,801 INFO org.apache.hadoop.mapred.JobClient (main):
Rack-local map tasks=13
2013-10-22 08:59:43,801 INFO org.apache.hadoop.mapred.JobClient (main):
Launched map tasks=7630
2013-10-22 08:59:43,801 INFO org.apache.hadoop.mapred.JobClient (main):
Data-local map tasks=7617
2013-10-22 08:59:43,801 INFO org.apache.hadoop.mapred.JobClient (main):
SLOTS_MILLIS_REDUCES=125929437
2013-10-22 08:59:43,801 INFO org.apache.hadoop.mapred.JobClient (main):
File Input Format Counters
2013-10-22 08:59:43,801 INFO org.apache.hadoop.mapred.JobClient (main):
Bytes Read=1000478157952
2013-10-22 08:59:43,801 INFO org.apache.hadoop.mapred.JobClient (main):
File Output Format Counters
2013-10-22 08:59:43,802 INFO org.apache.hadoop.mapred.JobClient (main):
Bytes Written=1000000000000
2013-10-22 08:59:43,802 INFO org.apache.hadoop.mapred.JobClient (main):
FileSystemCounters
2013-10-22 08:59:43,802 INFO org.apache.hadoop.mapred.JobClient (main):
FILE_BYTES_READ=550120627113
2013-10-22 08:59:43,802 INFO org.apache.hadoop.mapred.JobClient (main):
HDFS_BYTES_READ=1000478910352
2013-10-22 08:59:43,802 INFO org.apache.hadoop.mapred.JobClient (main):
FILE_BYTES_WRITTEN=725563889045
2013-10-22 08:59:43,802 INFO org.apache.hadoop.mapred.JobClient (main):
HDFS_BYTES_WRITTEN=1000000000000
2013-10-22 08:59:43,802 INFO org.apache.hadoop.mapred.JobClient (main):
Map-Reduce Framework
2013-10-22 08:59:43,803 INFO org.apache.hadoop.mapred.JobClient (main):
Map output materialized bytes=240948272514
2013-10-22 08:59:43,803 INFO org.apache.hadoop.mapred.JobClient (main):
Map input records=10000000000
2013-10-22 08:59:43,803 INFO org.apache.hadoop.mapred.JobClient (main):
Reduce shuffle bytes=240948272514
2013-10-22 08:59:43,803 INFO org.apache.hadoop.mapred.JobClient (main):
Spilled Records=30000000000
2013-10-22 08:59:43,803 INFO org.apache.hadoop.mapred.JobClient (main):
Map output bytes=1000000000000
2013-10-22 08:59:43,804 INFO org.apache.hadoop.mapred.JobClient (main):
Total committed heap usage (bytes)=4841182068736
2013-10-22 08:59:43,804 INFO org.apache.hadoop.mapred.JobClient (main):
CPU time spent (ms)=118779050
2013-10-22 08:59:43,804 INFO org.apache.hadoop.mapred.JobClient (main):
Map input bytes=1000000000000
2013-10-22 08:59:43,804 INFO org.apache.hadoop.mapred.JobClient (main):
SPLIT_RAW_BYTES=752400
2013-10-22 08:59:43,804 INFO org.apache.hadoop.mapred.JobClient (main):
Combine input records=0
2013-10-22 08:59:43,804 INFO org.apache.hadoop.mapred.JobClient (main):
Reduce input records=10000000000
2013-10-22 08:59:43,804 INFO org.apache.hadoop.mapred.JobClient (main):
Reduce input groups=4294967296
2013-10-22 08:59:43,804 INFO org.apache.hadoop.mapred.JobClient (main):
Combine output records=0
2013-10-22 08:59:43,804 INFO org.apache.hadoop.mapred.JobClient (main):
Physical memory (bytes) snapshot=5092578041856
2013-10-22 08:59:43,804 INFO org.apache.hadoop.mapred.JobClient (main):
Reduce output records=10000000000
2013-10-22 08:59:43,804 INFO org.apache.hadoop.mapred.JobClient (main):
Virtual memory (bytes) snapshot=9391598915584
2013-10-22 08:59:43,804 INFO org.apache.hadoop.mapred.JobClient (main):
Map output records=10000000000



On Wed, Oct 23, 2013 at 8:17 AM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:

> How many map and reduce slots are you using per tasktracker in MR1?  How
> do the average map times compare? (MR2 reports this directly on the web UI,
> but you can also get a sense in MR1 by scrolling through the map tasks
> page).  Can you share the counters for MR1?
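>
> (For the first question, the MR1 slot counts are whatever the tasktracker
> settings in your MR1 mapred-site.xml say; a sketch of the relevant
> properties is below, with placeholder values.)
>
> <property>
>   <name>mapred.tasktracker.map.tasks.maximum</name>
>   <value>8</value>
> </property>
> <property>
>   <name>mapred.tasktracker.reduce.tasks.maximum</name>
>   <value>4</value>
> </property>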
>
> -Sandy
>
>
> On Wed, Oct 23, 2013 at 12:23 AM, Jian Fang <jian.fang.subscribe@gmail.com
> > wrote:
>
>> Unfortunately, turning off JVM reuse still gave the same result, i.e.,
>> about 90 minutes for MR2. I don't think the killed reduces could account
>> for a 2x slowdown. There must be something very wrong, either in the
>> configuration or in the code. Any hints?
>>
>>
>>
>> On Tue, Oct 22, 2013 at 5:50 PM, Jian Fang <jian.fang.subscribe@gmail.com
>> > wrote:
>>
>>> Thanks Sandy. I will try to turn JVM reuse off and see what happens.
>>>
>>> Yes, I saw quite a few exceptions in the task attempts. For instance:
>>>
>>>
>>> 2013-10-20 03:13:58,751 ERROR [main]
>>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>>> as:hadoop (auth:SIMPLE) cause:java.nio.channels.ClosedChannelException
>>> 2013-10-20 03:13:58,752 ERROR [Thread-6]
>>> org.apache.hadoop.hdfs.DFSClient: Failed to close file
>>> /1-tb-data/_temporary/1/_temporary/attempt_1382237301855_0001_m_000200_1/part-m-00200
>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
>>> No lease on
>>> /1-tb-data/_temporary/1/_temporary/attempt_1382237301855_0001_m_000200_1/part-m-00200:
>>> File does not exist. Holder
>>> DFSClient_attempt_1382237301855_0001_m_000200_1_872378586_1 does not have
>>> any open files.
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2737)
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:2801)
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2783)
>>>         at
>>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:611)
>>>         at
>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:429)
>>>         at
>>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:48077)
>>>         at
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:582)
>>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>>> --
>>>         at com.sun.proxy.$Proxy10.complete(Unknown Source)
>>>         at
>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:371)
>>>         at
>>> org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:1910)
>>>         at
>>> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1896)
>>>         at
>>> org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:773)
>>>         at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:790)
>>>         at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:847)
>>>         at
>>> org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2526)
>>>         at
>>> org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2551)
>>>         at
>>> org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
>>> 2013-10-20 03:13:58,753 WARN [main] org.apache.hadoop.mapred.YarnChild:
>>> Exception running child : java.nio.channels.ClosedChannelException
>>>         at
>>> org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1325)
>>>         at
>>> org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:98)
>>>         at
>>> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:61)
>>>         at java.io.DataOutputStream.write(DataOutputStream.java:107)
>>>         at
>>> org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.write(TeraOutputFormat.java:69)
>>>         at
>>> org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.write(TeraOutputFormat.java:57)
>>>         at
>>> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:646)
>>>         at
>>> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>>>         at
>>> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>>>         at
>>> org.apache.hadoop.examples.terasort.TeraGen$SortGenMapper.map(TeraGen.java:230)
>>>
>>>
>>>
>>> On Tue, Oct 22, 2013 at 4:45 PM, Sandy Ryza <sandy.ryza@cloudera.com>wrote:
>>>
>>>> It looks like many of your reduce tasks were killed.  Do you know why?
>>>>  Also, MR2 doesn't have JVM reuse, so it might make sense to compare it to
>>>> MR1 with JVM reuse turned off.
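>>>>
>>>> For reference, a minimal sketch of the MR1 side of that comparison (the
>>>> property name is the standard Hadoop 1.x one; adjust to your actual
>>>> mapred-site.xml):
>>>>
>>>> <!-- mapred-site.xml (MR1): give each task its own JVM, matching MR2's
>>>>      behavior; 1 means no reuse, -1 means unlimited reuse -->
>>>> <property>
>>>>   <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>   <value>1</value>
>>>> </property>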
>>>>
>>>> -Sandy
>>>>
>>>>
>>>> On Tue, Oct 22, 2013 at 3:06 PM, Jian Fang <
>>>> jian.fang.subscribe@gmail.com> wrote:
>>>>
>>>>> The Terasort output for MR2 is as follows.
>>>>>
>>>>> 2013-10-22 21:40:16,261 INFO org.apache.hadoop.mapreduce.Job (main):
>>>>> Counters: 46
>>>>>         File System Counters
>>>>>                 FILE: Number of bytes read=456102049355
>>>>>                 FILE: Number of bytes written=897246250517
>>>>>                 FILE: Number of read operations=0
>>>>>                 FILE: Number of large read operations=0
>>>>>                 FILE: Number of write operations=0
>>>>>                 HDFS: Number of bytes read=1000000851200
>>>>>                 HDFS: Number of bytes written=1000000000000
>>>>>                 HDFS: Number of read operations=32131
>>>>>                 HDFS: Number of large read operations=0
>>>>>                 HDFS: Number of write operations=224
>>>>>         Job Counters
>>>>>                 Killed map tasks=1
>>>>>                 Killed reduce tasks=20
>>>>>                 Launched map tasks=7601
>>>>>                 Launched reduce tasks=132
>>>>>                 Data-local map tasks=7591
>>>>>                 Rack-local map tasks=10
>>>>>                 Total time spent by all maps in occupied slots
>>>>> (ms)=1696141311
>>>>>                 Total time spent by all reduces in occupied slots
>>>>> (ms)=2664045096
>>>>>         Map-Reduce Framework
>>>>>                 Map input records=10000000000
>>>>>                 Map output records=10000000000
>>>>>                 Map output bytes=1020000000000
>>>>>                 Map output materialized bytes=440486356802
>>>>>                 Input split bytes=851200
>>>>>                 Combine input records=0
>>>>>                 Combine output records=0
>>>>>                 Reduce input groups=10000000000
>>>>>                 Reduce shuffle bytes=440486356802
>>>>>                 Reduce input records=10000000000
>>>>>                 Reduce output records=10000000000
>>>>>                 Spilled Records=20000000000
>>>>>                 Shuffled Maps =851200
>>>>>                 Failed Shuffles=61
>>>>>                 Merged Map outputs=851200
>>>>>                 GC time elapsed (ms)=4215666
>>>>>                 CPU time spent (ms)=192433000
>>>>>                 Physical memory (bytes) snapshot=3349356380160
>>>>>                 Virtual memory (bytes) snapshot=9665208745984
>>>>>                 Total committed heap usage (bytes)=3636854259712
>>>>>         Shuffle Errors
>>>>>                 BAD_ID=0
>>>>>                 CONNECTION=0
>>>>>                 IO_ERROR=4
>>>>>                 WRONG_LENGTH=0
>>>>>                 WRONG_MAP=0
>>>>>                 WRONG_REDUCE=0
>>>>>         File Input Format Counters
>>>>>                 Bytes Read=1000000000000
>>>>>         File Output Format Counters
>>>>>                 Bytes Written=1000000000000
>>>>>
>>>>> Thanks,
>>>>>
>>>>> John
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Oct 22, 2013 at 2:44 PM, Jian Fang <
>>>>> jian.fang.subscribe@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have the same problem. I compared Hadoop 2.2.0 with Hadoop 1.0.3,
>>>>>> and it turned out that terasort on MR2 is two times slower than on
>>>>>> MR1. I can hardly believe it.
>>>>>>
>>>>>> The cluster has 20 nodes with 19 data nodes.  My Hadoop 2.2.0 cluster
>>>>>> configurations are as follows.
>>>>>>
>>>>>>         mapreduce.map.java.opts = "-Xmx512m";
>>>>>>         mapreduce.reduce.java.opts = "-Xmx1536m";
>>>>>>         mapreduce.map.memory.mb = "768";
>>>>>>         mapreduce.reduce.memory.mb = "2048";
>>>>>>
>>>>>>         yarn.scheduler.minimum-allocation-mb = "256";
>>>>>>         yarn.scheduler.maximum-allocation-mb = "8192";
>>>>>>         yarn.nodemanager.resource.memory-mb = "12288";
>>>>>>         yarn.nodemanager.resource.cpu-vcores = "16";
>>>>>>
>>>>>>         mapreduce.reduce.shuffle.parallelcopies = "20";
>>>>>>         mapreduce.task.io.sort.factor = "48";
>>>>>>         mapreduce.task.io.sort.mb = "200";
>>>>>>         mapreduce.map.speculative = "true";
>>>>>>         mapreduce.reduce.speculative = "true";
>>>>>>         mapreduce.framework.name = "yarn";
>>>>>>         yarn.app.mapreduce.am.job.task.listener.thread-count = "60";
>>>>>>         mapreduce.map.cpu.vcores = "1";
>>>>>>         mapreduce.reduce.cpu.vcores = "2";
>>>>>>
>>>>>>         mapreduce.job.jvm.numtasks = "20";
>>>>>>         mapreduce.map.output.compress = "true";
>>>>>>         mapreduce.map.output.compress.codec =
>>>>>> "org.apache.hadoop.io.compress.SnappyCodec";
>>>>>>
>>>>>>         yarn.resourcemanager.client.thread-count = "64";
>>>>>>         yarn.resourcemanager.scheduler.client.thread-count = "64";
>>>>>>         yarn.resourcemanager.resource-tracker.client.thread-count =
>>>>>> "64";
>>>>>>         yarn.resourcemanager.scheduler.class =
>>>>>> "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler";
>>>>>>         yarn.nodemanager.aux-services = "mapreduce_shuffle";
>>>>>>         yarn.nodemanager.aux-services.mapreduce.shuffle.class =
>>>>>> "org.apache.hadoop.mapred.ShuffleHandler";
>>>>>>         yarn.nodemanager.vmem-pmem-ratio = "5";
>>>>>>         yarn.nodemanager.container-executor.class =
>>>>>> "org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor";
>>>>>>         yarn.nodemanager.container-manager.thread-count = "64";
>>>>>>         yarn.nodemanager.localizer.client.thread-count = "20";
>>>>>>         yarn.nodemanager.localizer.fetch.thread-count = "20";
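>>>>>>
>>>>>> (For scale, from the values above: 12288 MB per NodeManager fits
>>>>>> roughly 12288 / 768 = 16 map containers or 12288 / 2048 = 6 reduce
>>>>>> containers at a time, not counting the AM container.)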
>>>>>>
>>>>>> My Hadoop 1.0.3 cluster has the same memory/disks/cores and almost the
>>>>>> same other configuration. In MR1, the 1 TB terasort took about 45
>>>>>> minutes, but it took around 90 minutes in MR2.
>>>>>>
>>>>>> Does anyone know what is wrong here? Or do I need some special
>>>>>> configurations for terasort to work better in MR2?
>>>>>>
>>>>>> Thanks in advance,
>>>>>>
>>>>>> John
>>>>>>
>>>>>>
>>>>>> On Tue, Jun 18, 2013 at 3:11 AM, Michel Segel <
>>>>>> michael_segel@hotmail.com> wrote:
>>>>>>
>>>>>>> Sam,
>>>>>>> I think your cluster is too small for any meaningful conclusions to
>>>>>>> be made.
>>>>>>>
>>>>>>>
>>>>>>> Sent from a remote device. Please excuse any typos...
>>>>>>>
>>>>>>> Mike Segel
>>>>>>>
>>>>>>> On Jun 18, 2013, at 3:58 AM, sam liu <samliuhadoop@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi Harsh,
>>>>>>>
>>>>>>> Thanks for your detailed response! The efficiency of my Yarn cluster
>>>>>>> improved a lot after increasing the number of reducers
>>>>>>> (mapreduce.job.reduces) in mapred-site.xml. But I still have some
>>>>>>> questions about the way Yarn executes an MRv1 job:
>>>>>>>
>>>>>>> 1. In Hadoop 1.x, a job is executed by map tasks and reduce tasks
>>>>>>> together, with a typical process (map > shuffle > reduce). In Yarn, as
>>>>>>> I understand it, an MRv1 job is executed only by the ApplicationMaster.
>>>>>>> - Yarn can run multiple kinds of jobs (MR, MPI, ...), but an MRv1 job
>>>>>>> has a special execution process (map > shuffle > reduce) in Hadoop 1.x,
>>>>>>> so how does Yarn execute an MRv1 job? Does it still include the special
>>>>>>> MR steps from Hadoop 1.x, like map, sort, merge, combine and shuffle?
>>>>>>> - Do the MRv1 parameters still work for Yarn? Like
>>>>>>> mapreduce.task.io.sort.mb and mapreduce.map.sort.spill.percent?
>>>>>>> - What's the general process for the ApplicationMaster of Yarn to
>>>>>>> execute a job?
>>>>>>>
>>>>>>> 2. In Hadoop 1.x, we can set the map/reduce slots by setting
>>>>>>> 'mapred.tasktracker.map.tasks.maximum' and
>>>>>>> 'mapred.tasktracker.reduce.tasks.maximum'.
>>>>>>> - For Yarn, the above two parameters no longer work, as Yarn uses
>>>>>>> containers instead, right?
>>>>>>> - For Yarn, we can set the total physical memory for a NodeManager
>>>>>>> using 'yarn.nodemanager.resource.memory-mb'. But how do we set the
>>>>>>> default size of physical memory for a container?
>>>>>>> - How do we set the maximum size of physical memory for a container?
>>>>>>> By the parameter 'mapred.child.java.opts'?
>>>>>>>
>>>>>>> Thanks as always!
>>>>>>>
>>>>>>> 2013/6/9 Harsh J <harsh@cloudera.com>
>>>>>>>
>>>>>>>> Hi Sam,
>>>>>>>>
>>>>>>>> > - How do I know the container number? Why do you say it will be 22
>>>>>>>> > containers due to 22 GB of memory?
>>>>>>>>
>>>>>>>> The MR2's default configuration requests 1 GB resource each for Map
>>>>>>>> and Reduce containers. It requests 1.5 GB for the AM container that
>>>>>>>> runs the job, additionally. This is tunable using the properties
>>>>>>>> Sandy's mentioned in his post.
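>>>>>>>>
>>>>>>>> As an illustration only (the values shown are the defaults being
>>>>>>>> described, not recommendations), these are the knobs in question, set
>>>>>>>> in mapred-site.xml:
>>>>>>>>
>>>>>>>> <!-- per-container resource requests for map tasks, reduce tasks and
>>>>>>>>      the MR ApplicationMaster -->
>>>>>>>> <property>
>>>>>>>>   <name>mapreduce.map.memory.mb</name>
>>>>>>>>   <value>1024</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>>   <name>mapreduce.reduce.memory.mb</name>
>>>>>>>>   <value>1024</value>
>>>>>>>> </property>
>>>>>>>> <property>
>>>>>>>>   <name>yarn.app.mapreduce.am.resource.mb</name>
>>>>>>>>   <value>1536</value>
>>>>>>>> </property>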
>>>>>>>>
>>>>>>>> > - My machine has 32 GB of memory; how much memory should be
>>>>>>>> > assigned to containers?
>>>>>>>>
>>>>>>>> This is a general question. You may use the same process you took to
>>>>>>>> decide optimal number of slots in MR1 to decide this here. Every
>>>>>>>> container is a new JVM, and you're limited by the CPUs you have
>>>>>>>> there
>>>>>>>> (if not the memory). Either increase memory requests from jobs, to
>>>>>>>> lower # of concurrent containers at a given time (runtime change),
>>>>>>>> or
>>>>>>>> lower NM's published memory resources to control the same (config
>>>>>>>> change).
>>>>>>>>
>>>>>>>> > - In mapred-site.xml, if I set 'mapreduce.framework.name' to
>>>>>>>> > 'yarn', will the other parameters in mapred-site.xml still work in
>>>>>>>> > the yarn framework? Like 'mapreduce.task.io.sort.mb' and
>>>>>>>> > 'mapreduce.map.sort.spill.percent'?
>>>>>>>>
>>>>>>>> Yes, all of these properties will still work. Old properties
>>>>>>>> specific
>>>>>>>> to JobTracker or TaskTracker (usually found as a keyword in the
>>>>>>>> config
>>>>>>>> name) will not apply anymore.
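>>>>>>>>
>>>>>>>> For example (a sketch using property names already mentioned in this
>>>>>>>> thread; values are illustrative): the first setting below is still
>>>>>>>> honored by MR2 tasks, while the second is TaskTracker-specific and is
>>>>>>>> simply ignored under YARN.
>>>>>>>>
>>>>>>>> <!-- still applies to MR2 tasks running on YARN -->
>>>>>>>> <property>
>>>>>>>>   <name>mapreduce.task.io.sort.mb</name>
>>>>>>>>   <value>200</value>
>>>>>>>> </property>
>>>>>>>> <!-- TaskTracker-specific, so it has no effect under YARN -->
>>>>>>>> <property>
>>>>>>>>   <name>mapred.tasktracker.map.tasks.maximum</name>
>>>>>>>>   <value>8</value>
>>>>>>>> </property>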
>>>>>>>>
>>>>>>>> On Sun, Jun 9, 2013 at 2:21 PM, sam liu <samliuhadoop@gmail.com>
>>>>>>>> wrote:
>>>>>>>> > Hi Harsh,
>>>>>>>> >
>>>>>>>> > Following the above suggestions, I removed the duplicated setting
>>>>>>>> > and reduced the values of 'yarn.nodemanager.resource.cpu-cores',
>>>>>>>> > 'yarn.nodemanager.vcores-pcores-ratio' and
>>>>>>>> > 'yarn.nodemanager.resource.memory-mb' to 16, 8 and 12000. After
>>>>>>>> > that, the efficiency improved by about 18%. I have a few questions:
>>>>>>>> >
>>>>>>>> > - How do I know the container number? Why do you say it will be 22
>>>>>>>> > containers due to 22 GB of memory?
>>>>>>>> > - My machine has 32 GB of memory; how much memory should be
>>>>>>>> > assigned to containers?
>>>>>>>> > - In mapred-site.xml, if I set 'mapreduce.framework.name' to
>>>>>>>> > 'yarn', will the other parameters in mapred-site.xml still work in
>>>>>>>> > the yarn framework? Like 'mapreduce.task.io.sort.mb' and
>>>>>>>> > 'mapreduce.map.sort.spill.percent'?
>>>>>>>> >
>>>>>>>> > Thanks!
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > 2013/6/8 Harsh J <harsh@cloudera.com>
>>>>>>>> >>
>>>>>>>> >> Hey Sam,
>>>>>>>> >>
>>>>>>>> >> Did you get a chance to retry with Sandy's suggestions? The config
>>>>>>>> >> appears to be asking NMs to run roughly 22 total containers (as
>>>>>>>> >> opposed to 12 total tasks in the MR1 config) due to a 22 GB memory
>>>>>>>> >> resource. This could have a big impact, given the CPU is still the
>>>>>>>> >> same for both test runs.
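>>>>>>>> >>
>>>>>>>> >> (Roughly: 22000 MB per NM divided by the 1024 MB default request is
>>>>>>>> >> about 21-22 concurrent task containers, versus the 8 map + 4 reduce
>>>>>>>> >> = 12 slots in the MR1 config.)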
>>>>>>>> >>
>>>>>>>> >> On Fri, Jun 7, 2013 at 12:23 PM, Sandy Ryza <
>>>>>>>> sandy.ryza@cloudera.com>
>>>>>>>> >> wrote:
>>>>>>>> >> > Hey Sam,
>>>>>>>> >> >
>>>>>>>> >> > Thanks for sharing your results.  I'm definitely curious about
>>>>>>>> >> > what's causing the difference.
>>>>>>>> >> >
>>>>>>>> >> > A couple of observations:
>>>>>>>> >> > It looks like you've got yarn.nodemanager.resource.memory-mb in
>>>>>>>> >> > there twice with two different values.
>>>>>>>> >> >
>>>>>>>> >> > Your max JVM memory of 1000 MB is (dangerously?) close to the
>>>>>>>> >> > default mapreduce.map/reduce.memory.mb of 1024 MB. Are any of
>>>>>>>> >> > your tasks getting killed for running over resource limits?
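>>>>>>>> >> >
>>>>>>>> >> > If that is what is happening, a common pattern (illustrative
>>>>>>>> >> > values only) is to leave headroom between the task heap and the
>>>>>>>> >> > container request, e.g. in mapred-site.xml:
>>>>>>>> >> >
>>>>>>>> >> > <!-- container request larger than the task heap (-Xmx), leaving
>>>>>>>> >> >      room for JVM and native overhead -->
>>>>>>>> >> > <property>
>>>>>>>> >> >   <name>mapreduce.map.memory.mb</name>
>>>>>>>> >> >   <value>1536</value>
>>>>>>>> >> > </property>
>>>>>>>> >> > <property>
>>>>>>>> >> >   <name>mapreduce.map.java.opts</name>
>>>>>>>> >> >   <value>-Xmx1024m</value>
>>>>>>>> >> > </property>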
>>>>>>>> >> >
>>>>>>>> >> > -Sandy
>>>>>>>> >> >
>>>>>>>> >> >
>>>>>>>> >> > On Thu, Jun 6, 2013 at 10:21 PM, sam liu <
>>>>>>>> samliuhadoop@gmail.com> wrote:
>>>>>>>> >> >>
>>>>>>>> >> >> The terasort execution log shows that the reduce phase spent
>>>>>>>> >> >> about 5.5 minutes going from 33% to 35%, as shown below.
>>>>>>>> >> >> 13/06/10 08:02:22 INFO mapreduce.Job:  map 100% reduce 31%
>>>>>>>> >> >> 13/06/10 08:02:25 INFO mapreduce.Job:  map 100% reduce 32%
>>>>>>>> >> >> 13/06/10 08:02:46 INFO mapreduce.Job:  map 100% reduce 33%
>>>>>>>> >> >> 13/06/10 08:08:16 INFO mapreduce.Job:  map 100% reduce 35%
>>>>>>>> >> >> 13/06/10 08:08:19 INFO mapreduce.Job:  map 100% reduce 40%
>>>>>>>> >> >> 13/06/10 08:08:22 INFO mapreduce.Job:  map 100% reduce 43%
>>>>>>>> >> >>
>>>>>>>> >> >> Anyway, below are my configurations for your reference. Thanks!
>>>>>>>> >> >> (A) core-site.xml
>>>>>>>> >> >> only define 'fs.default.name' and 'hadoop.tmp.dir'
>>>>>>>> >> >>
>>>>>>>> >> >> (B) hdfs-site.xml
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>dfs.replication</name>
>>>>>>>> >> >>     <value>1</value>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>dfs.name.dir</name>
>>>>>>>> >> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/dfs_name_dir</value>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>dfs.data.dir</name>
>>>>>>>> >> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/dfs_data_dir</value>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>dfs.block.size</name>
>>>>>>>> >> >>     <value>134217728</value><!-- 128MB -->
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>dfs.namenode.handler.count</name>
>>>>>>>> >> >>     <value>64</value>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>dfs.datanode.handler.count</name>
>>>>>>>> >> >>     <value>10</value>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >> (C) mapred-site.xml
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>mapreduce.cluster.temp.dir</name>
>>>>>>>> >> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/mapreduce_temp</value>
>>>>>>>> >> >>     <description>No description</description>
>>>>>>>> >> >>     <final>true</final>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>mapreduce.cluster.local.dir</name>
>>>>>>>> >> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/mapreduce_local_dir</value>
>>>>>>>> >> >>     <description>No description</description>
>>>>>>>> >> >>     <final>true</final>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >> <property>
>>>>>>>> >> >>   <name>mapreduce.child.java.opts</name>
>>>>>>>> >> >>   <value>-Xmx1000m</value>
>>>>>>>> >> >> </property>
>>>>>>>> >> >>
>>>>>>>> >> >> <property>
>>>>>>>> >> >>     <name>mapreduce.framework.name</name>
>>>>>>>> >> >>     <value>yarn</value>
>>>>>>>> >> >>    </property>
>>>>>>>> >> >>
>>>>>>>> >> >>  <property>
>>>>>>>> >> >>     <name>mapreduce.tasktracker.map.tasks.maximum</name>
>>>>>>>> >> >>     <value>8</value>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>mapreduce.tasktracker.reduce.tasks.maximum</name>
>>>>>>>> >> >>     <value>4</value>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>mapreduce.tasktracker.outofband.heartbeat</name>
>>>>>>>> >> >>     <value>true</value>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >> (D) yarn-site.xml
>>>>>>>> >> >>  <property>
>>>>>>>> >> >>     <name>yarn.resourcemanager.resource-tracker.address</name>
>>>>>>>> >> >>     <value>node1:18025</value>
>>>>>>>> >> >>     <description>host is the hostname of the resource manager
>>>>>>>> and
>>>>>>>> >> >>     port is the port on which the NodeManagers contact the
>>>>>>>> Resource
>>>>>>>> >> >> Manager.
>>>>>>>> >> >>     </description>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <description>The address of the RM web
>>>>>>>> application.</description>
>>>>>>>> >> >>     <name>yarn.resourcemanager.webapp.address</name>
>>>>>>>> >> >>     <value>node1:18088</value>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>yarn.resourcemanager.scheduler.address</name>
>>>>>>>> >> >>     <value>node1:18030</value>
>>>>>>>> >> >>     <description>host is the hostname of the resourcemanager
>>>>>>>> and port
>>>>>>>> >> >> is
>>>>>>>> >> >> the port
>>>>>>>> >> >>     on which the Applications in the cluster talk to the
>>>>>>>> Resource
>>>>>>>> >> >> Manager.
>>>>>>>> >> >>     </description>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>yarn.resourcemanager.address</name>
>>>>>>>> >> >>     <value>node1:18040</value>
>>>>>>>> >> >>     <description>the host is the hostname of the
>>>>>>>> ResourceManager and
>>>>>>>> >> >> the
>>>>>>>> >> >> port is the port on
>>>>>>>> >> >>     which the clients can talk to the Resource Manager.
>>>>>>>> </description>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>yarn.nodemanager.local-dirs</name>
>>>>>>>> >> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_local_dir</value>
>>>>>>>> >> >>     <description>the local directories used by the
>>>>>>>> >> >> nodemanager</description>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>yarn.nodemanager.address</name>
>>>>>>>> >> >>     <value>0.0.0.0:18050</value>
>>>>>>>> >> >>     <description>the nodemanagers bind to this
>>>>>>>> port</description>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>yarn.nodemanager.resource.memory-mb</name>
>>>>>>>> >> >>     <value>10240</value>
>>>>>>>> >> >>     <description>the amount of memory on the NodeManager in
>>>>>>>> >> >> GB</description>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>yarn.nodemanager.remote-app-log-dir</name>
>>>>>>>> >> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_app-logs</value>
>>>>>>>> >> >>     <description>directory on hdfs where the application logs
>>>>>>>> are moved
>>>>>>>> >> >> to
>>>>>>>> >> >> </description>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>    <property>
>>>>>>>> >> >>     <name>yarn.nodemanager.log-dirs</name>
>>>>>>>> >> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_log</value>
>>>>>>>> >> >>     <description>the directories used by Nodemanagers as log
>>>>>>>> >> >> directories</description>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>yarn.nodemanager.aux-services</name>
>>>>>>>> >> >>     <value>mapreduce.shuffle</value>
>>>>>>>> >> >>     <description>shuffle service that needs to be set for Map
>>>>>>>> Reduce to
>>>>>>>> >> >> run </description>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>   <property>
>>>>>>>> >> >>     <name>yarn.resourcemanager.client.thread-count</name>
>>>>>>>> >> >>     <value>64</value>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>  <property>
>>>>>>>> >> >>     <name>yarn.nodemanager.resource.cpu-cores</name>
>>>>>>>> >> >>     <value>24</value>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >> <property>
>>>>>>>> >> >>     <name>yarn.nodemanager.vcores-pcores-ratio</name>
>>>>>>>> >> >>     <value>3</value>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>  <property>
>>>>>>>> >> >>     <name>yarn.nodemanager.resource.memory-mb</name>
>>>>>>>> >> >>     <value>22000</value>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>  <property>
>>>>>>>> >> >>     <name>yarn.nodemanager.vmem-pmem-ratio</name>
>>>>>>>> >> >>     <value>2.1</value>
>>>>>>>> >> >>   </property>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >> 2013/6/7 Harsh J <harsh@cloudera.com>
>>>>>>>> >> >>>
>>>>>>>> >> >>> Not tuning configurations at all is wrong. YARN uses memory
>>>>>>>> >> >>> resource based scheduling, and hence MR2 would be requesting
>>>>>>>> >> >>> 1 GB minimum by default, causing it, on base configs, to max
>>>>>>>> >> >>> out at 8 total containers (due to the 8 GB NM memory resource
>>>>>>>> >> >>> config). Do share your configs, as at this point none of us can
>>>>>>>> >> >>> tell what the issue is.
>>>>>>>> >> >>>
>>>>>>>> >> >>> Obviously, it isn't our goal to make MR2 slower for users or to
>>>>>>>> >> >>> not care about such things :)
>>>>>>>> >> >>>
>>>>>>>> >> >>> On Fri, Jun 7, 2013 at 8:45 AM, sam liu <
>>>>>>>> samliuhadoop@gmail.com>
>>>>>>>> >> >>> wrote:
>>>>>>>> >> >>> > At the beginning, I just wanted to do a quick comparison of
>>>>>>>> >> >>> > MRv1 and Yarn. But they have many differences, and to be fair
>>>>>>>> >> >>> > I did not tune their configurations at all, so I got the
>>>>>>>> >> >>> > above test results. After analyzing the results, I will, no
>>>>>>>> >> >>> > doubt, tune the configurations and do the comparison again.
>>>>>>>> >> >>> >
>>>>>>>> >> >>> > Do you have any thoughts on the current test results? I think
>>>>>>>> >> >>> > that, compared with MRv1, Yarn is better in the Map phase
>>>>>>>> >> >>> > (teragen test), but worse in the Reduce phase (terasort test).
>>>>>>>> >> >>> > And do you have any detailed suggestions/comments/materials
>>>>>>>> >> >>> > on Yarn performance tuning?
>>>>>>>> >> >>> >
>>>>>>>> >> >>> > Thanks!
>>>>>>>> >> >>> >
>>>>>>>> >> >>> >
>>>>>>>> >> >>> > 2013/6/7 Marcos Luis Ortiz Valmaseda <
>>>>>>>> marcosluis2186@gmail.com>
>>>>>>>> >> >>> >>
>>>>>>>> >> >>> >> Why not tune the configurations?
>>>>>>>> >> >>> >> Both frameworks have many areas to tune:
>>>>>>>> >> >>> >> - Combiners, shuffle optimization, block size, etc.
>>>>>>>> >> >>> >>
>>>>>>>> >> >>> >>
>>>>>>>> >> >>> >>
>>>>>>>> >> >>> >> 2013/6/6 sam liu <samliuhadoop@gmail.com>
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> Hi Experts,
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> We are thinking about whether to use Yarn in the near
>>>>>>>> >> >>> >>> future, and I ran teragen/terasort on Yarn and MRv1 for
>>>>>>>> >> >>> >>> comparison.
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> My environment is a three-node cluster, and each node has
>>>>>>>> >> >>> >>> similar hardware: 2 CPUs (4 cores), 32 GB memory. Both the
>>>>>>>> >> >>> >>> Yarn and MRv1 clusters are set up on the same environment.
>>>>>>>> >> >>> >>> To be fair, I did not do any performance tuning on their
>>>>>>>> >> >>> >>> configurations, but used the default configuration values.
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> Before testing, I thought Yarn would be much better than
>>>>>>>> >> >>> >>> MRv1 if they both used the default configuration, because
>>>>>>>> >> >>> >>> Yarn is a better framework than MRv1. However, the test
>>>>>>>> >> >>> >>> results show some differences:
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> MRv1: Hadoop-1.1.1
>>>>>>>> >> >>> >>> Yarn: Hadoop-2.0.4
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> (A) Teragen: generate 10 GB data:
>>>>>>>> >> >>> >>> - MRv1: 193 sec
>>>>>>>> >> >>> >>> - Yarn: 69 sec
>>>>>>>> >> >>> >>> Yarn is 2.8 times better than MRv1
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> (B) Terasort: sort 10 GB data:
>>>>>>>> >> >>> >>> - MRv1: 451 sec
>>>>>>>> >> >>> >>> - Yarn: 1136 sec
>>>>>>>> >> >>> >>> Yarn is 2.5 times worse than MRv1
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> After a quick analysis, I think the direct cause might be
>>>>>>>> >> >>> >>> that Yarn is much faster than MRv1 in the Map phase, but
>>>>>>>> >> >>> >>> much worse in the Reduce phase.
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> Here I have two questions:
>>>>>>>> >> >>> >>> - Why do my tests show Yarn is worse than MRv1 for terasort?
>>>>>>>> >> >>> >>> - What's the strategy for tuning Yarn performance? Are
>>>>>>>> >> >>> >>> there any materials?
>>>>>>>> >> >>> >>>
>>>>>>>> >> >>> >>> Thanks!
>>>>>>>> >> >>> >>
>>>>>>>> >> >>> >>
>>>>>>>> >> >>> >>
>>>>>>>> >> >>> >>
>>>>>>>> >> >>> >> --
>>>>>>>> >> >>> >> Marcos Ortiz Valmaseda
>>>>>>>> >> >>> >> Product Manager at PDVSA
>>>>>>>> >> >>> >> http://about.me/marcosortiz
>>>>>>>> >> >>> >>
>>>>>>>> >> >>> >
>>>>>>>> >> >>>
>>>>>>>> >> >>>
>>>>>>>> >> >>>
>>>>>>>> >> >>> --
>>>>>>>> >> >>> Harsh J
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >> --
>>>>>>>> >> Harsh J
>>>>>>>> >
>>>>>>>> >
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Harsh J
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
