hadoop-mapreduce-user mailing list archives

From Jian Fang <jian.fang.subscr...@gmail.com>
Subject Re: Why my tests shows Yarn is worse than MRv1 for terasort?
Date Thu, 24 Oct 2013 00:01:49 GMT
Really? Does that mean I cannot compare terasort in MR2 with MR1 directly?
If so, this really should be documented.

Thanks.


On Wed, Oct 23, 2013 at 4:54 PM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:

> I should have brought this up earlier - the Terasort benchmark
> requirements changed recently to make data less compressible.  The MR2
> version of Terasort has this change, but the MR1 version does not.    So
> Snappy should be working fine, but the data that MR2 is using is less
> compressible.
>
>
> On Wed, Oct 23, 2013 at 4:48 PM, Jian Fang <jian.fang.subscribe@gmail.com>wrote:
>
>> Ok, I think I know where the problem is now. I used Snappy to compress
>> the map output.
>>
>> Calculating the compression ratio as "Map output materialized bytes" /
>> "Map output bytes":
>>
>> MR2
>> 440519309909/1020000000000 = 0.431881676
>>
>> MR1
>> 240948272514/1000000000000 = 0.240948273
>>
>>
>> It seems something is wrong with Snappy in my Hadoop 2 and, as a result,
>> MR2 needs to fetch much more data during the shuffle phase. Is there any
>> way to check Snappy in MR2?
>>
>>
>>
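>> (One quick check I could run, assuming the native library checker is
>> available in this Hadoop 2 build, is:
>>
>>   hadoop checknative -a
>>
>> on a task node, which should report whether the snappy native library is
>> actually being loaded.)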
>>
>> On Wed, Oct 23, 2013 at 4:26 PM, Jian Fang <jian.fang.subscribe@gmail.com
>> > wrote:
>>
>>> I looked at the results for MR1 and MR2. Reduce input groups for MR1 is
>>> 4,294,967,296, but 10,000,000,000 for MR2.
>>> That is to say, the number of unique keys fed into the reducers is
>>> 10,000,000,000 in MR2, which is really weird. Is there any problem in the
>>> terasort code?
>>>
>>>
>>> On Wed, Oct 23, 2013 at 2:16 PM, Jian Fang <
>>> jian.fang.subscribe@gmail.com> wrote:
>>>
>>>> Reducing the number of map containers to 8 slightly improved the total
>>>> time, to 84 minutes. Here is the output. Also, from the log, there is no
>>>> clear reason why the containers were killed other than messages such as
>>>> "Container killed by the
>>>> ApplicationMaster../container_1382237301855_0001_01_000001/syslog:Container
>>>> killed on request. Exit code is 143"
>>>>
>>>>
>>>> 2013-10-23 21:09:34,325 INFO org.apache.hadoop.mapreduce.Job (main):
>>>> Counters: 46
>>>>         File System Counters
>>>>                 FILE: Number of bytes read=455484809066
>>>>                 FILE: Number of bytes written=896642344343
>>>>
>>>>                 FILE: Number of read operations=0
>>>>                 FILE: Number of large read operations=0
>>>>                 FILE: Number of write operations=0
>>>>                 HDFS: Number of bytes read=1000000841624
>>>>
>>>>                 HDFS: Number of bytes written=1000000000000
>>>>                 HDFS: Number of read operations=25531
>>>>
>>>>                 HDFS: Number of large read operations=0
>>>>                 HDFS: Number of write operations=150
>>>>
>>>>         Job Counters
>>>>                 Killed map tasks=1
>>>>                 Killed reduce tasks=11
>>>>                 Launched map tasks=7449
>>>>                 Launched reduce tasks=86
>>>>                 Data-local map tasks=7434
>>>>                 Rack-local map tasks=15
>>>>                 Total time spent by all maps in occupied slots
>>>> (ms)=1030941232
>>>>                 Total time spent by all reduces in occupied slots
>>>> (ms)=1574732272
>>>>
>>>>         Map-Reduce Framework
>>>>                 Map input records=10000000000
>>>>                 Map output records=10000000000
>>>>                 Map output bytes=1020000000000
>>>>                 Map output materialized bytes=440519309909
>>>>                 Input split bytes=841624
>>>>
>>>>                 Combine input records=0
>>>>                 Combine output records=0
>>>>                 Reduce input groups=10000000000
>>>>                 Reduce shuffle bytes=440519309909
>>>>
>>>>                 Reduce input records=10000000000
>>>>                 Reduce output records=10000000000
>>>>                 Spilled Records=20000000000
>>>>                 Shuffled Maps =558600
>>>>                 Failed Shuffles=0
>>>>                 Merged Map outputs=558600
>>>>                 GC time elapsed (ms)=2855534
>>>>                 CPU time spent (ms)=170381420
>>>>                 Physical memory (bytes) snapshot=3704938954752
>>>>                 Virtual memory (bytes) snapshot=11455208308736
>>>>                 Total committed heap usage (bytes)=4523248582656
>>>>         Shuffle Errors
>>>>                 BAD_ID=0
>>>>                 CONNECTION=0
>>>>                 IO_ERROR=0
>>>>
>>>>                 WRONG_LENGTH=0
>>>>                 WRONG_MAP=0
>>>>                 WRONG_REDUCE=0
>>>>         File Input Format Counters
>>>>                 Bytes Read=1000000000000
>>>>         File Output Format Counters
>>>>                 Bytes Written=1000000000000
>>>>
>>>>
>>>>
>>>> On Wed, Oct 23, 2013 at 1:46 PM, Jian Fang <
>>>> jian.fang.subscribe@gmail.com> wrote:
>>>>
>>>>> I already started the tests with only 8 containers for map in MR2; they
>>>>> are still running.
>>>>>
>>>>> But shouldn't YARN make better use of the memory? If I map the YARN
>>>>> containers in MR2 to exactly 8 map and 3 reduce slots as in MR1, what is
>>>>> the real advantage of YARN then? I remember that one goal of YARN was to
>>>>> solve the issue that map slots cannot be used for reduce and reduce slots
>>>>> cannot be used for map. Shouldn't YARN be smart enough to handle the
>>>>> concurrent tasks?
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Oct 23, 2013 at 1:17 PM, Sandy Ryza <sandy.ryza@cloudera.com>wrote:
>>>>>
>>>>>> Increasing the slowstart is not meant to increase performance, but
>>>>>> should make for a fairer comparison.  Have you tried making sure that in
>>>>>> MR2 only 8 map tasks are running concurrently, or boosting MR1 up to 16?
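>>>>>>
>>>>>> For example, with yarn.nodemanager.resource.memory-mb = 12288, one way
>>>>>> to hold MR2 to roughly 8 concurrent map tasks per node would be to raise
>>>>>> the per-map request so that 12288 / 1536 = 8, e.g. a sketch for
>>>>>> mapred-site.xml:
>>>>>>
>>>>>>   <property>
>>>>>>     <name>mapreduce.map.memory.mb</name>
>>>>>>     <value>1536</value>
>>>>>>   </property>
>>>>>>
>>>>>> (just the arithmetic from your posted settings; pick whatever value
>>>>>> matches the comparison you want).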
>>>>>>
>>>>>> -Sandy
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 23, 2013 at 12:55 PM, Jian Fang <
>>>>>> jian.fang.subscribe@gmail.com> wrote:
>>>>>>
>>>>>>> Changing mapreduce.job.reduce.slowstart.completedmaps to 0.99 does not
>>>>>>> look good. The map phase alone took 48 minutes and the total time seems
>>>>>>> to be even longer. Is there any way to make the map phase run faster?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Oct 23, 2013 at 10:05 AM, Jian Fang <
>>>>>>> jian.fang.subscribe@gmail.com> wrote:
>>>>>>>
>>>>>>>> Thanks Sandy.
>>>>>>>>
>>>>>>>> io.sort.record.percent is the default value 0.05 for both MR1 and
>>>>>>>> MR2.  mapreduce.job.reduce.slowstart.completedmaps in MR2 and mapred.reduce.slowstart.completed.maps
>>>>>>>> in MR1 both use the default value 0.05.
>>>>>>>>
>>>>>>>> I tried allocating 1536MB and 1024MB to the map container some time
>>>>>>>> ago, but the changes did not give me a better result, so I changed it
>>>>>>>> back to 768MB.
>>>>>>>>
>>>>>>>> Will try mapred.reduce.slowstart.completed.maps=.99 to see what
>>>>>>>> happens. BTW, I should use
>>>>>>>> mapreduce.job.reduce.slowstart.completedmaps in MR2, right?
>>>>>>>>
>>>>>>>> Also, in MR1 I can specify tasktracker.http.threads, but I could not
>>>>>>>> find the counterpart for MR2. Which property should I tune for the
>>>>>>>> HTTP threads?
>>>>>>>>
>>>>>>>> Thanks again.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Oct 23, 2013 at 9:40 AM, Sandy Ryza <
>>>>>>>> sandy.ryza@cloudera.com> wrote:
>>>>>>>>
>>>>>>>>> Based on SLOTS_MILLIS_MAPS, it looks like your map tasks are
>>>>>>>>> taking about three times as long in MR2 as they are in MR1.  This is
>>>>>>>>> probably because you allow twice as many map tasks to run at a time in MR2 (12288/768
>>>>>>>>> = 16).  Being able to use all the containers isn't necessarily a good thing
>>>>>>>>> if you are oversubscribing your node's resources.  Because of the different
>>>>>>>>> way that MR1 and MR2 view resources, I think it's better to test with
>>>>>>>>> mapred.reduce.slowstart.completed.maps=.99 so that the map and reduce
>>>>>>>>> phases will run separately.
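>>>>>>>>>
>>>>>>>>> (In MR2 the equivalent property is
>>>>>>>>> mapreduce.job.reduce.slowstart.completedmaps; a minimal sketch of the
>>>>>>>>> change in mapred-site.xml would be something like:
>>>>>>>>>
>>>>>>>>>   <property>
>>>>>>>>>     <name>mapreduce.job.reduce.slowstart.completedmaps</name>
>>>>>>>>>     <value>0.99</value>
>>>>>>>>>   </property>
>>>>>>>>>
>>>>>>>>> or the same value passed per job on the command line with -D.)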
>>>>>>>>>
>>>>>>>>> On the other hand, it looks like your MR1 run has more spilled
>>>>>>>>> records than MR2.  You should set io.sort.record.percent in MR1 to
>>>>>>>>> .13, which should improve MR1 performance and provide a fairer
>>>>>>>>> comparison (MR2 does this tuning for you automatically).
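>>>>>>>>>
>>>>>>>>> (A minimal sketch of that MR1 change in mapred-site.xml would be
>>>>>>>>> something like:
>>>>>>>>>
>>>>>>>>>   <property>
>>>>>>>>>     <name>io.sort.record.percent</name>
>>>>>>>>>     <value>0.13</value>
>>>>>>>>>   </property>
>>>>>>>>>
>>>>>>>>> matching the .13 suggested above.)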
>>>>>>>>>
>>>>>>>>> -Sandy
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Oct 23, 2013 at 9:22 AM, Jian Fang <
>>>>>>>>> jian.fang.subscribe@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> The numbers of map slots and reduce slots on each data node for
>>>>>>>>>> MR1 are 8 and 3, respectively. Since MR2 can use all containers for
>>>>>>>>>> either map or reduce, I would expect MR2 to be faster.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Oct 23, 2013 at 8:17 AM, Sandy Ryza <
>>>>>>>>>> sandy.ryza@cloudera.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> How many map and reduce slots are you using per tasktracker in
>>>>>>>>>>> MR1?  How do the average map times compare? (MR2 reports this directly on
>>>>>>>>>>> the web UI, but you can also get a sense in MR1 by scrolling through the
>>>>>>>>>>> map tasks page).  Can you share the counters for MR1?
>>>>>>>>>>>
>>>>>>>>>>> -Sandy
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 23, 2013 at 12:23 AM, Jian Fang <
>>>>>>>>>>> jian.fang.subscribe@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Unfortunately, turning off JVM reuse still gave the same result,
>>>>>>>>>>>> i.e., about 90 minutes for MR2. I don't think the killed reduces
>>>>>>>>>>>> could account for a 2x slowdown. There must be something very wrong
>>>>>>>>>>>> either in the configuration or the code. Any hints?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Oct 22, 2013 at 5:50 PM, Jian Fang <
>>>>>>>>>>>> jian.fang.subscribe@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks Sandy. I will try turning JVM reuse off and see what
>>>>>>>>>>>>> happens.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, I saw quite a few exceptions in the task attempts. For
>>>>>>>>>>>>> instance:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2013-10-20 03:13:58,751 ERROR [main]
>>>>>>>>>>>>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>>>>>>>>>>>>> as:hadoop (auth:SIMPLE) cause:java.nio.channels.ClosedChannelException
>>>>>>>>>>>>> 2013-10-20 03:13:58,752 ERROR [Thread-6]
>>>>>>>>>>>>> org.apache.hadoop.hdfs.DFSClient: Failed to close file
>>>>>>>>>>>>> /1-tb-data/_temporary/1/_temporary/attempt_1382237301855_0001_m_000200_1/part-m-00200
>>>>>>>>>>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
>>>>>>>>>>>>> No lease on
>>>>>>>>>>>>> /1-tb-data/_temporary/1/_temporary/attempt_1382237301855_0001_m_000200_1/part-m-00200:
>>>>>>>>>>>>> File does not exist. Holder
>>>>>>>>>>>>> DFSClient_attempt_1382237301855_0001_m_000200_1_872378586_1 does not have
>>>>>>>>>>>>> any open files.
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2737)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:2801)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2783)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:611)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:429)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:48077)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:582)
>>>>>>>>>>>>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>>>>>>>>>>>>> --
>>>>>>>>>>>>>         at com.sun.proxy.$Proxy10.complete(Unknown Source)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:371)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:1910)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1896)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:773)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:790)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:847)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2526)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2551)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
>>>>>>>>>>>>> 2013-10-20 03:13:58,753 WARN [main]
>>>>>>>>>>>>> org.apache.hadoop.mapred.YarnChild: Exception running child :
>>>>>>>>>>>>> java.nio.channels.ClosedChannelException
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1325)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:98)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:61)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> java.io.DataOutputStream.write(DataOutputStream.java:107)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.write(TeraOutputFormat.java:69)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.write(TeraOutputFormat.java:57)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:646)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>>>>>>>>>>>>>         at
>>>>>>>>>>>>> org.apache.hadoop.examples.terasort.TeraGen$SortGenMapper.map(TeraGen.java:230)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Oct 22, 2013 at 4:45 PM, Sandy Ryza <
>>>>>>>>>>>>> sandy.ryza@cloudera.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> It looks like many of your reduce tasks were killed.  Do you
>>>>>>>>>>>>>> know why?  Also, MR2 doesn't have JVM reuse, so it might make sense to
>>>>>>>>>>>>>> compare it to MR1 with JVM reuse turned off.
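>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (For the MR1 side, a minimal sketch of turning reuse off, assuming
>>>>>>>>>>>>>> the classic property name, would be something like this in
>>>>>>>>>>>>>> mapred-site.xml:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   <property>
>>>>>>>>>>>>>>     <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>>>>>>>>>>     <value>1</value>
>>>>>>>>>>>>>>   </property>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> i.e. one task per JVM.)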
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Sandy
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Oct 22, 2013 at 3:06 PM, Jian Fang <
>>>>>>>>>>>>>> jian.fang.subscribe@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The Terasort output for MR2 is as follows.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2013-10-22 21:40:16,261 INFO org.apache.hadoop.mapreduce.Job
>>>>>>>>>>>>>>> (main): Counters: 46
>>>>>>>>>>>>>>>         File System Counters
>>>>>>>>>>>>>>>                 FILE: Number of bytes read=456102049355
>>>>>>>>>>>>>>>                 FILE: Number of bytes written=897246250517
>>>>>>>>>>>>>>>                 FILE: Number of read operations=0
>>>>>>>>>>>>>>>                 FILE: Number of large read operations=0
>>>>>>>>>>>>>>>                 FILE: Number of write operations=0
>>>>>>>>>>>>>>>                 HDFS: Number of bytes read=1000000851200
>>>>>>>>>>>>>>>                 HDFS: Number of bytes written=1000000000000
>>>>>>>>>>>>>>>                 HDFS: Number of read operations=32131
>>>>>>>>>>>>>>>                 HDFS: Number of large read operations=0
>>>>>>>>>>>>>>>                 HDFS: Number of write operations=224
>>>>>>>>>>>>>>>         Job Counters
>>>>>>>>>>>>>>>                 Killed map tasks=1
>>>>>>>>>>>>>>>                 Killed reduce tasks=20
>>>>>>>>>>>>>>>                 Launched map tasks=7601
>>>>>>>>>>>>>>>                 Launched reduce tasks=132
>>>>>>>>>>>>>>>                 Data-local map tasks=7591
>>>>>>>>>>>>>>>                 Rack-local map tasks=10
>>>>>>>>>>>>>>>                 Total time spent by all maps in occupied
>>>>>>>>>>>>>>> slots (ms)=1696141311
>>>>>>>>>>>>>>>                 Total time spent by all reduces in occupied
>>>>>>>>>>>>>>> slots (ms)=2664045096
>>>>>>>>>>>>>>>         Map-Reduce Framework
>>>>>>>>>>>>>>>                 Map input records=10000000000
>>>>>>>>>>>>>>>                 Map output records=10000000000
>>>>>>>>>>>>>>>                 Map output bytes=1020000000000
>>>>>>>>>>>>>>>                 Map output materialized bytes=440486356802
>>>>>>>>>>>>>>>                 Input split bytes=851200
>>>>>>>>>>>>>>>                 Combine input records=0
>>>>>>>>>>>>>>>                 Combine output records=0
>>>>>>>>>>>>>>>                 Reduce input groups=10000000000
>>>>>>>>>>>>>>>                 Reduce shuffle bytes=440486356802
>>>>>>>>>>>>>>>                 Reduce input records=10000000000
>>>>>>>>>>>>>>>                 Reduce output records=10000000000
>>>>>>>>>>>>>>>                 Spilled Records=20000000000
>>>>>>>>>>>>>>>                 Shuffled Maps =851200
>>>>>>>>>>>>>>>                 Failed Shuffles=61
>>>>>>>>>>>>>>>                 Merged Map outputs=851200
>>>>>>>>>>>>>>>                 GC time elapsed (ms)=4215666
>>>>>>>>>>>>>>>                 CPU time spent (ms)=192433000
>>>>>>>>>>>>>>>                 Physical memory (bytes)
>>>>>>>>>>>>>>> snapshot=3349356380160
>>>>>>>>>>>>>>>                 Virtual memory (bytes) snapshot=9665208745984
>>>>>>>>>>>>>>>                 Total committed heap usage
>>>>>>>>>>>>>>> (bytes)=3636854259712
>>>>>>>>>>>>>>>         Shuffle Errors
>>>>>>>>>>>>>>>                 BAD_ID=0
>>>>>>>>>>>>>>>                 CONNECTION=0
>>>>>>>>>>>>>>>                 IO_ERROR=4
>>>>>>>>>>>>>>>                 WRONG_LENGTH=0
>>>>>>>>>>>>>>>                 WRONG_MAP=0
>>>>>>>>>>>>>>>                 WRONG_REDUCE=0
>>>>>>>>>>>>>>>         File Input Format Counters
>>>>>>>>>>>>>>>                 Bytes Read=1000000000000
>>>>>>>>>>>>>>>         File Output Format Counters
>>>>>>>>>>>>>>>                 Bytes Written=1000000000000
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Oct 22, 2013 at 2:44 PM, Jian Fang <
>>>>>>>>>>>>>>> jian.fang.subscribe@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have the same problem. I compared Hadoop 2.2.0 with
>>>>>>>>>>>>>>>> Hadoop 1.0.3, and it turned out that terasort on MR2 is 2 times
>>>>>>>>>>>>>>>> slower than on MR1. I can hardly believe it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The cluster has 20 nodes with 19 data nodes.  My Hadoop
>>>>>>>>>>>>>>>> 2.2.0 cluster configurations are as follows.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         mapreduce.map.java.opts = "-Xmx512m";
>>>>>>>>>>>>>>>>         mapreduce.reduce.java.opts = "-Xmx1536m";
>>>>>>>>>>>>>>>>         mapreduce.map.memory.mb = "768";
>>>>>>>>>>>>>>>>         mapreduce.reduce.memory.mb = "2048";
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         yarn.scheduler.minimum-allocation-mb = "256";
>>>>>>>>>>>>>>>>         yarn.scheduler.maximum-allocation-mb = "8192";
>>>>>>>>>>>>>>>>         yarn.nodemanager.resource.memory-mb = "12288";
>>>>>>>>>>>>>>>>         yarn.nodemanager.resource.cpu-vcores = "16";
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         mapreduce.reduce.shuffle.parallelcopies = "20";
>>>>>>>>>>>>>>>>         mapreduce.task.io.sort.factor = "48";
>>>>>>>>>>>>>>>>         mapreduce.task.io.sort.mb = "200";
>>>>>>>>>>>>>>>>         mapreduce.map.speculative = "true";
>>>>>>>>>>>>>>>>         mapreduce.reduce.speculative = "true";
>>>>>>>>>>>>>>>>         mapreduce.framework.name = "yarn";
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> yarn.app.mapreduce.am.job.task.listener.thread-count = "60";
>>>>>>>>>>>>>>>>         mapreduce.map.cpu.vcores = "1";
>>>>>>>>>>>>>>>>         mapreduce.reduce.cpu.vcores = "2";
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         mapreduce.job.jvm.numtasks = "20";
>>>>>>>>>>>>>>>>         mapreduce.map.output.compress = "true";
>>>>>>>>>>>>>>>>         mapreduce.map.output.compress.codec =
>>>>>>>>>>>>>>>> "org.apache.hadoop.io.compress.SnappyCodec";
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>         yarn.resourcemanager.client.thread-count = "64";
>>>>>>>>>>>>>>>>         yarn.resourcemanager.scheduler.client.thread-count
>>>>>>>>>>>>>>>> = "64";
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> yarn.resourcemanager.resource-tracker.client.thread-count = "64";
>>>>>>>>>>>>>>>>         yarn.resourcemanager.scheduler.class =
>>>>>>>>>>>>>>>> "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler";
>>>>>>>>>>>>>>>>         yarn.nodemanager.aux-services = "mapreduce_shuffle";
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> yarn.nodemanager.aux-services.mapreduce.shuffle.class =
>>>>>>>>>>>>>>>> "org.apache.hadoop.mapred.ShuffleHandler";
>>>>>>>>>>>>>>>>         yarn.nodemanager.vmem-pmem-ratio = "5";
>>>>>>>>>>>>>>>>         yarn.nodemanager.container-executor.class =
>>>>>>>>>>>>>>>> "org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor";
>>>>>>>>>>>>>>>>         yarn.nodemanager.container-manager.thread-count =
>>>>>>>>>>>>>>>> "64";
>>>>>>>>>>>>>>>>         yarn.nodemanager.localizer.client.thread-count =
>>>>>>>>>>>>>>>> "20";
>>>>>>>>>>>>>>>>         yarn.nodemanager.localizer.fetch.thread-count =
>>>>>>>>>>>>>>>> "20";
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My Hadoop 1.0.3 cluster has the same memory/disks/cores and
>>>>>>>>>>>>>>>> almost the same other configuration. In MR1, the 1TB terasort
>>>>>>>>>>>>>>>> took about 45 minutes, but it took around 90 minutes in MR2.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Does anyone know what is wrong here? Or do I need some
>>>>>>>>>>>>>>>> special configurations for terasort to work better in MR2?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> John
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Jun 18, 2013 at 3:11 AM, Michel Segel <
>>>>>>>>>>>>>>>> michael_segel@hotmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sam,
>>>>>>>>>>>>>>>>> I think your cluster is too small for any meaningful
>>>>>>>>>>>>>>>>> conclusions to be made.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sent from a remote device. Please excuse any typos...
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Mike Segel
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Jun 18, 2013, at 3:58 AM, sam liu <
>>>>>>>>>>>>>>>>> samliuhadoop@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Harsh,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for your detailed response! Now, the efficiency of
>>>>>>>>>>>>>>>>> my Yarn cluster has improved a lot after increasing the reducer
>>>>>>>>>>>>>>>>> number (mapreduce.job.reduces) in mapred-site.xml. But I still
>>>>>>>>>>>>>>>>> have some questions about the way Yarn executes an MRv1 job:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1. In Hadoop 1.x, a job is executed by map tasks and
>>>>>>>>>>>>>>>>> reduce tasks together, with a typical process (map > shuffle >
>>>>>>>>>>>>>>>>> reduce). In Yarn, as far as I know, an MRv1 job is executed
>>>>>>>>>>>>>>>>> only by the ApplicationMaster.
>>>>>>>>>>>>>>>>> - Yarn can run multiple kinds of jobs (MR, MPI, ...), but an
>>>>>>>>>>>>>>>>> MRv1 job has a special execution process (map > shuffle >
>>>>>>>>>>>>>>>>> reduce) in Hadoop 1.x. How does Yarn execute an MRv1 job? Does
>>>>>>>>>>>>>>>>> it still include the special MR steps from Hadoop 1.x, like
>>>>>>>>>>>>>>>>> map, sort, merge, combine and shuffle?
>>>>>>>>>>>>>>>>> - Do the MRv1 parameters still work for Yarn? Like
>>>>>>>>>>>>>>>>> mapreduce.task.io.sort.mb and mapreduce.map.sort.spill.percent?
>>>>>>>>>>>>>>>>> - What's the general process for the ApplicationMaster in Yarn
>>>>>>>>>>>>>>>>> to execute a job?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2. In Hadoop 1.x, we can set the map/reduce slots by
>>>>>>>>>>>>>>>>> setting 'mapred.tasktracker.map.tasks.maximum' and
>>>>>>>>>>>>>>>>> 'mapred.tasktracker.reduce.tasks.maximum'.
>>>>>>>>>>>>>>>>> - For Yarn, the above two parameters do not work any more, as
>>>>>>>>>>>>>>>>> Yarn uses containers instead, right?
>>>>>>>>>>>>>>>>> - For Yarn, we can set the total physical memory for a
>>>>>>>>>>>>>>>>> NodeManager using 'yarn.nodemanager.resource.memory-mb'.
>>>>>>>>>>>>>>>>> But how do we set the default size of physical memory for a
>>>>>>>>>>>>>>>>> container?
>>>>>>>>>>>>>>>>> - How do we set the maximum size of physical memory for a
>>>>>>>>>>>>>>>>> container? With the parameter 'mapred.child.java.opts'?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks as always!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2013/6/9 Harsh J <harsh@cloudera.com>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Sam,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> > - How to know the container number? Why you say it will
>>>>>>>>>>>>>>>>>> be 22 containers due to a 22 GB memory?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The MR2's default configuration requests 1 GB resource
>>>>>>>>>>>>>>>>>> each for Map
>>>>>>>>>>>>>>>>>> and Reduce containers. It requests 1.5 GB for the AM
>>>>>>>>>>>>>>>>>> container that
>>>>>>>>>>>>>>>>>> runs the job, additionally. This is tunable using the
>>>>>>>>>>>>>>>>>> properties
>>>>>>>>>>>>>>>>>> Sandy's mentioned in his post.
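>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> As a rough worked example using your posted values: with
>>>>>>>>>>>>>>>>>> yarn.nodemanager.resource.memory-mb = 22000 and the 1 GB (1024 MB)
>>>>>>>>>>>>>>>>>> default per task container, each NM can fit about 22000 / 1024,
>>>>>>>>>>>>>>>>>> i.e. roughly 21-22 concurrent containers, plus the single 1.5 GB
>>>>>>>>>>>>>>>>>> AM container for the job somewhere in the cluster.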
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> > - My machine has 32 GB memory, how many memory is
>>>>>>>>>>>>>>>>>> proper to be assigned to containers?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This is a general question. You may use the same process
>>>>>>>>>>>>>>>>>> you took to
>>>>>>>>>>>>>>>>>> decide optimal number of slots in MR1 to decide this
>>>>>>>>>>>>>>>>>> here. Every
>>>>>>>>>>>>>>>>>> container is a new JVM, and you're limited by the CPUs
>>>>>>>>>>>>>>>>>> you have there
>>>>>>>>>>>>>>>>>> (if not the memory). Either increase memory requests from
>>>>>>>>>>>>>>>>>> jobs, to
>>>>>>>>>>>>>>>>>> lower # of concurrent containers at a given time (runtime
>>>>>>>>>>>>>>>>>> change), or
>>>>>>>>>>>>>>>>>> lower NM's published memory resources to control the same
>>>>>>>>>>>>>>>>>> (config
>>>>>>>>>>>>>>>>>> change).
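>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For instance (just a sketch), the config-change route would be
>>>>>>>>>>>>>>>>>> something like this in yarn-site.xml:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>   <property>
>>>>>>>>>>>>>>>>>>     <name>yarn.nodemanager.resource.memory-mb</name>
>>>>>>>>>>>>>>>>>>     <value>12000</value>
>>>>>>>>>>>>>>>>>>   </property>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> with the value picked to match how many concurrent JVMs the node's
>>>>>>>>>>>>>>>>>> CPUs and memory can really sustain.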
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> > - In mapred-site.xml, if I set '
>>>>>>>>>>>>>>>>>> mapreduce.framework.name' to be 'yarn', will other
>>>>>>>>>>>>>>>>>> parameters for mapred-site.xml still work in yarn framework? Like
>>>>>>>>>>>>>>>>>> 'mapreduce.task.io.sort.mb' and 'mapreduce.map.sort.spill.percent'
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Yes, all of these properties will still work. Old
>>>>>>>>>>>>>>>>>> properties specific
>>>>>>>>>>>>>>>>>> to JobTracker or TaskTracker (usually found as a keyword
>>>>>>>>>>>>>>>>>> in the config
>>>>>>>>>>>>>>>>>> name) will not apply anymore.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sun, Jun 9, 2013 at 2:21 PM, sam liu <
>>>>>>>>>>>>>>>>>> samliuhadoop@gmail.com> wrote:
>>>>>>>>>>>>>>>>>> > Hi Harsh,
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > According to the above suggestions, I removed the
>>>>>>>>>>>>>>>>>> > duplicated setting, and reduced the values of
>>>>>>>>>>>>>>>>>> > 'yarn.nodemanager.resource.cpu-cores',
>>>>>>>>>>>>>>>>>> > 'yarn.nodemanager.vcores-pcores-ratio' and
>>>>>>>>>>>>>>>>>> > 'yarn.nodemanager.resource.memory-mb' to 16, 8 and 12000.
>>>>>>>>>>>>>>>>>> > And then, the efficiency improved by about 18%.  I have
>>>>>>>>>>>>>>>>>> > some questions:
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > - How to know the container number? Why you say it will
>>>>>>>>>>>>>>>>>> be 22 containers due
>>>>>>>>>>>>>>>>>> > to a 22 GB memory?
>>>>>>>>>>>>>>>>>> > - My machine has 32 GB memory, how many memory is
>>>>>>>>>>>>>>>>>> proper to be assigned to
>>>>>>>>>>>>>>>>>> > containers?
>>>>>>>>>>>>>>>>>> > - In mapred-site.xml, if I set '
>>>>>>>>>>>>>>>>>> mapreduce.framework.name' to be 'yarn', will
>>>>>>>>>>>>>>>>>> > other parameters for mapred-site.xml still work in yarn
>>>>>>>>>>>>>>>>>> framework? Like
>>>>>>>>>>>>>>>>>> > 'mapreduce.task.io.sort.mb' and
>>>>>>>>>>>>>>>>>> 'mapreduce.map.sort.spill.percent'
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > Thanks!
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > 2013/6/8 Harsh J <harsh@cloudera.com>
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> Hey Sam,
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> Did you get a chance to retry with Sandy's
>>>>>>>>>>>>>>>>>> suggestions? The config
>>>>>>>>>>>>>>>>>> >> appears to be asking NMs to use roughly 22 total
>>>>>>>>>>>>>>>>>> containers (as
>>>>>>>>>>>>>>>>>> >> opposed to 12 total tasks in MR1 config) due to a 22
>>>>>>>>>>>>>>>>>> GB memory
>>>>>>>>>>>>>>>>>> >> resource. This could have a big impact, given the CPU is
>>>>>>>>>>>>>>>>>> >> still the same for
>>>>>>>>>>>>>>>>>> >> both test runs.
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> On Fri, Jun 7, 2013 at 12:23 PM, Sandy Ryza <
>>>>>>>>>>>>>>>>>> sandy.ryza@cloudera.com>
>>>>>>>>>>>>>>>>>> >> wrote:
>>>>>>>>>>>>>>>>>> >> > Hey Sam,
>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>> >> > Thanks for sharing your results.  I'm definitely
>>>>>>>>>>>>>>>>>> curious about what's
>>>>>>>>>>>>>>>>>> >> > causing the difference.
>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>> >> > A couple observations:
>>>>>>>>>>>>>>>>>> >> > It looks like you've got
>>>>>>>>>>>>>>>>>> yarn.nodemanager.resource.memory-mb in there
>>>>>>>>>>>>>>>>>> >> > twice
>>>>>>>>>>>>>>>>>> >> > with two different values.
>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>> >> > Your max JVM memory of 1000 MB is (dangerously?)
>>>>>>>>>>>>>>>>>> close to the default
>>>>>>>>>>>>>>>>>> >> > mapreduce.map/reduce.memory.mb of 1024 MB. Are any
>>>>>>>>>>>>>>>>>> of your tasks getting
>>>>>>>>>>>>>>>>>> >> > killed for running over resource limits?
>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>> >> > -Sandy
>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>> >> > On Thu, Jun 6, 2013 at 10:21 PM, sam liu <
>>>>>>>>>>>>>>>>>> samliuhadoop@gmail.com> wrote:
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >> The terasort execution log shows that reduce spent
>>>>>>>>>>>>>>>>>> about 5.5 mins from
>>>>>>>>>>>>>>>>>> >> >> 33%
>>>>>>>>>>>>>>>>>> >> >> to 35% as below.
>>>>>>>>>>>>>>>>>> >> >> 13/06/10 08:02:22 INFO mapreduce.Job:  map 100%
>>>>>>>>>>>>>>>>>> reduce 31%
>>>>>>>>>>>>>>>>>> >> >> 13/06/10 08:02:25 INFO mapreduce.Job:  map 100%
>>>>>>>>>>>>>>>>>> reduce 32%
>>>>>>>>>>>>>>>>>> >> >> 13/06/10 08:02:46 INFO mapreduce.Job:  map 100%
>>>>>>>>>>>>>>>>>> reduce 33%
>>>>>>>>>>>>>>>>>> >> >> 13/06/10 08:08:16 INFO mapreduce.Job:  map 100%
>>>>>>>>>>>>>>>>>> reduce 35%
>>>>>>>>>>>>>>>>>> >> >> 13/06/10 08:08:19 INFO mapreduce.Job:  map 100%
>>>>>>>>>>>>>>>>>> reduce 40%
>>>>>>>>>>>>>>>>>> >> >> 13/06/10 08:08:22 INFO mapreduce.Job:  map 100%
>>>>>>>>>>>>>>>>>> reduce 43%
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >> Anyway, below are my configurations for your
>>>>>>>>>>>>>>>>>> reference. Thanks!
>>>>>>>>>>>>>>>>>> >> >> (A) core-site.xml
>>>>>>>>>>>>>>>>>> >> >> only define 'fs.default.name' and 'hadoop.tmp.dir'
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >> (B) hdfs-site.xml
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>dfs.replication</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>1</value>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>dfs.name.dir</name>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/dfs_name_dir</value>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>dfs.data.dir</name>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/dfs_data_dir</value>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>dfs.block.size</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>134217728</value><!-- 128MB -->
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>dfs.namenode.handler.count</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>64</value>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>dfs.datanode.handler.count</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>10</value>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >> (C) mapred-site.xml
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>mapreduce.cluster.temp.dir</name>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/mapreduce_temp</value>
>>>>>>>>>>>>>>>>>> >> >>     <description>No description</description>
>>>>>>>>>>>>>>>>>> >> >>     <final>true</final>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>mapreduce.cluster.local.dir</name>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/mapreduce_local_dir</value>
>>>>>>>>>>>>>>>>>> >> >>     <description>No description</description>
>>>>>>>>>>>>>>>>>> >> >>     <final>true</final>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >> <property>
>>>>>>>>>>>>>>>>>> >> >>   <name>mapreduce.child.java.opts</name>
>>>>>>>>>>>>>>>>>> >> >>   <value>-Xmx1000m</value>
>>>>>>>>>>>>>>>>>> >> >> </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >> <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>mapreduce.framework.name</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>yarn</value>
>>>>>>>>>>>>>>>>>> >> >>    </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>  <property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> <name>mapreduce.tasktracker.map.tasks.maximum</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>8</value>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> <name>mapreduce.tasktracker.reduce.tasks.maximum</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>4</value>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> <name>mapreduce.tasktracker.outofband.heartbeat</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>true</value>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >> (D) yarn-site.xml
>>>>>>>>>>>>>>>>>> >> >>  <property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> <name>yarn.resourcemanager.resource-tracker.address</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>node1:18025</value>
>>>>>>>>>>>>>>>>>> >> >>     <description>host is the hostname of the
>>>>>>>>>>>>>>>>>> resource manager and
>>>>>>>>>>>>>>>>>> >> >>     port is the port on which the NodeManagers
>>>>>>>>>>>>>>>>>> contact the Resource
>>>>>>>>>>>>>>>>>> >> >> Manager.
>>>>>>>>>>>>>>>>>> >> >>     </description>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <description>The address of the RM web
>>>>>>>>>>>>>>>>>> application.</description>
>>>>>>>>>>>>>>>>>> >> >>     <name>yarn.resourcemanager.webapp.address</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>node1:18088</value>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> <name>yarn.resourcemanager.scheduler.address</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>node1:18030</value>
>>>>>>>>>>>>>>>>>> >> >>     <description>host is the hostname of the
>>>>>>>>>>>>>>>>>> resourcemanager and port
>>>>>>>>>>>>>>>>>> >> >> is
>>>>>>>>>>>>>>>>>> >> >> the port
>>>>>>>>>>>>>>>>>> >> >>     on which the Applications in the cluster talk
>>>>>>>>>>>>>>>>>> to the Resource
>>>>>>>>>>>>>>>>>> >> >> Manager.
>>>>>>>>>>>>>>>>>> >> >>     </description>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>yarn.resourcemanager.address</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>node1:18040</value>
>>>>>>>>>>>>>>>>>> >> >>     <description>the host is the hostname of the
>>>>>>>>>>>>>>>>>> ResourceManager and
>>>>>>>>>>>>>>>>>> >> >> the
>>>>>>>>>>>>>>>>>> >> >> port is the port on
>>>>>>>>>>>>>>>>>> >> >>     which the clients can talk to the Resource
>>>>>>>>>>>>>>>>>> Manager. </description>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>yarn.nodemanager.local-dirs</name>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_local_dir</value>
>>>>>>>>>>>>>>>>>> >> >>     <description>the local directories used by the
>>>>>>>>>>>>>>>>>> >> >> nodemanager</description>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>yarn.nodemanager.address</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>0.0.0.0:18050</value>
>>>>>>>>>>>>>>>>>> >> >>     <description>the nodemanagers bind to this
>>>>>>>>>>>>>>>>>> port</description>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>yarn.nodemanager.resource.memory-mb</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>10240</value>
>>>>>>>>>>>>>>>>>> >> >>     <description>the amount of memory on the
>>>>>>>>>>>>>>>>>> NodeManager in
>>>>>>>>>>>>>>>>>> >> >> GB</description>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>yarn.nodemanager.remote-app-log-dir</name>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_app-logs</value>
>>>>>>>>>>>>>>>>>> >> >>     <description>directory on hdfs where the
>>>>>>>>>>>>>>>>>> application logs are moved
>>>>>>>>>>>>>>>>>> >> >> to
>>>>>>>>>>>>>>>>>> >> >> </description>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>    <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>yarn.nodemanager.log-dirs</name>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_log</value>
>>>>>>>>>>>>>>>>>> >> >>     <description>the directories used by
>>>>>>>>>>>>>>>>>> Nodemanagers as log
>>>>>>>>>>>>>>>>>> >> >> directories</description>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>yarn.nodemanager.aux-services</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>mapreduce.shuffle</value>
>>>>>>>>>>>>>>>>>> >> >>     <description>shuffle service that needs to be
>>>>>>>>>>>>>>>>>> set for Map Reduce to
>>>>>>>>>>>>>>>>>> >> >> run </description>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>   <property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> <name>yarn.resourcemanager.client.thread-count</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>64</value>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>  <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>yarn.nodemanager.resource.cpu-cores</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>24</value>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >> <property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> <name>yarn.nodemanager.vcores-pcores-ratio</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>3</value>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>  <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>yarn.nodemanager.resource.memory-mb</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>22000</value>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>  <property>
>>>>>>>>>>>>>>>>>> >> >>     <name>yarn.nodemanager.vmem-pmem-ratio</name>
>>>>>>>>>>>>>>>>>> >> >>     <value>2.1</value>
>>>>>>>>>>>>>>>>>> >> >>   </property>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >> 2013/6/7 Harsh J <harsh@cloudera.com>
>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>> >> >>> Not tuning the configurations at all is wrong. YARN
>>>>>>>>>>>>>>>>>> >> >>> uses memory-resource-based scheduling and hence MR2
>>>>>>>>>>>>>>>>>> >> >>> would be requesting 1 GB minimum by default, causing
>>>>>>>>>>>>>>>>>> >> >>> it, on base configs, to max out at 8 total containers
>>>>>>>>>>>>>>>>>> >> >>> (due to the 8 GB NM memory resource config). Do share
>>>>>>>>>>>>>>>>>> >> >>> your configs, as at this point none of us can tell
>>>>>>>>>>>>>>>>>> >> >>> what they are.
>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>> >> >>> Obviously, it isn't our goal to make MR2 slower
>>>>>>>>>>>>>>>>>> for users and to not
>>>>>>>>>>>>>>>>>> >> >>> care about such things :)
>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>> >> >>> On Fri, Jun 7, 2013 at 8:45 AM, sam liu <
>>>>>>>>>>>>>>>>>> samliuhadoop@gmail.com>
>>>>>>>>>>>>>>>>>> >> >>> wrote:
>>>>>>>>>>>>>>>>>> >> >>> > At the beginning, I just wanted to do a fast
>>>>>>>>>>>>>>>>>> >> >>> > comparison of MRv1 and Yarn. But they have many
>>>>>>>>>>>>>>>>>> >> >>> > differences, and to be fair I did not tune their
>>>>>>>>>>>>>>>>>> >> >>> > configurations at all. So I got the above test
>>>>>>>>>>>>>>>>>> >> >>> > results. After analyzing them, no doubt, I will
>>>>>>>>>>>>>>>>>> >> >>> > tune the configurations and do the comparison again.
>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>> >> >>> > Do you have any ideas about the current test
>>>>>>>>>>>>>>>>>> >> >>> > results? I think, compared with MRv1, Yarn is
>>>>>>>>>>>>>>>>>> >> >>> > better in the Map phase (teragen test), but worse
>>>>>>>>>>>>>>>>>> >> >>> > in the Reduce phase (terasort test).
>>>>>>>>>>>>>>>>>> >> >>> > Are there any detailed suggestions/comments/materials
>>>>>>>>>>>>>>>>>> >> >>> > on Yarn performance tuning?
>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>> >> >>> > Thanks!
>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>> >> >>> > 2013/6/7 Marcos Luis Ortiz Valmaseda <
>>>>>>>>>>>>>>>>>> marcosluis2186@gmail.com>
>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>> >> >>> >> Why not tune the configurations?
>>>>>>>>>>>>>>>>>> >> >>> >> Both frameworks have many areas to tune:
>>>>>>>>>>>>>>>>>> >> >>> >> - Combiners, Shuffle optimization, Block size,
>>>>>>>>>>>>>>>>>> etc
>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>> >> >>> >> 2013/6/6 sam liu <samliuhadoop@gmail.com>
>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>> >> >>> >>> Hi Experts,
>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>> >> >>> >>> We are thinking about whether to use Yarn or not
>>>>>>>>>>>>>>>>>> >> >>> >>> in the near future, and I ran teragen/terasort on
>>>>>>>>>>>>>>>>>> >> >>> >>> Yarn and MRv1 for comparison.
>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>> >> >>> >>> My environment is a three-node cluster, and each
>>>>>>>>>>>>>>>>>> >> >>> >>> node has similar hardware: 2 CPUs (4 cores), 32 GB
>>>>>>>>>>>>>>>>>> >> >>> >>> memory. Both the Yarn and MRv1 clusters are set up
>>>>>>>>>>>>>>>>>> >> >>> >>> on the same environment. To be fair, I did not do
>>>>>>>>>>>>>>>>>> >> >>> >>> any performance tuning of their configurations, but
>>>>>>>>>>>>>>>>>> >> >>> >>> used the default configuration values.
>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>> >> >>> >>> Before testing, I thought Yarn would be much
>>>>>>>>>>>>>>>>>> >> >>> >>> better than MRv1 if they both used the default
>>>>>>>>>>>>>>>>>> >> >>> >>> configuration, because Yarn is a better framework
>>>>>>>>>>>>>>>>>> >> >>> >>> than MRv1. However, the test results show some
>>>>>>>>>>>>>>>>>> >> >>> >>> differences:
>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>> >> >>> >>> MRv1: Hadoop-1.1.1
>>>>>>>>>>>>>>>>>> >> >>> >>> Yarn: Hadoop-2.0.4
>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>> >> >>> >>> (A) Teragen: generate 10 GB data:
>>>>>>>>>>>>>>>>>> >> >>> >>> - MRv1: 193 sec
>>>>>>>>>>>>>>>>>> >> >>> >>> - Yarn: 69 sec
>>>>>>>>>>>>>>>>>> >> >>> >>> Yarn is 2.8 times better than MRv1
>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>> >> >>> >>> (B) Terasort: sort 10 GB data:
>>>>>>>>>>>>>>>>>> >> >>> >>> - MRv1: 451 sec
>>>>>>>>>>>>>>>>>> >> >>> >>> - Yarn: 1136 sec
>>>>>>>>>>>>>>>>>> >> >>> >>> Yarn is 2.5 times worse than MRv1
>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>> >> >>> >>> After a quick analysis, I think the direct cause
>>>>>>>>>>>>>>>>>> >> >>> >>> might be that Yarn is much faster than MRv1 in the
>>>>>>>>>>>>>>>>>> >> >>> >>> Map phase, but much worse in the Reduce phase.
>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>> >> >>> >>> Here I have two questions:
>>>>>>>>>>>>>>>>>> >> >>> >>> - Why do my tests show Yarn is worse than MRv1
>>>>>>>>>>>>>>>>>> >> >>> >>> for terasort?
>>>>>>>>>>>>>>>>>> >> >>> >>> - What's the strategy for tuning Yarn performance?
>>>>>>>>>>>>>>>>>> >> >>> >>> Are there any materials?
>>>>>>>>>>>>>>>>>> >> >>> >>>
>>>>>>>>>>>>>>>>>> >> >>> >>> Thanks!
>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>> >> >>> >> --
>>>>>>>>>>>>>>>>>> >> >>> >> Marcos Ortiz Valmaseda
>>>>>>>>>>>>>>>>>> >> >>> >> Product Manager at PDVSA
>>>>>>>>>>>>>>>>>> >> >>> >> http://about.me/marcosortiz
>>>>>>>>>>>>>>>>>> >> >>> >>
>>>>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>>>>> >> >>> --
>>>>>>>>>>>>>>>>>> >> >>> Harsh J
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> --
>>>>>>>>>>>>>>>>>> >> Harsh J
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> Harsh J
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
