Subject: Re: Why my tests shows Yarn is worse than MRv1 for terasort?
From: Sandy Ryza <sandy.ryza@cloudera.com>
To: user@hadoop.apache.org
Date: Wed, 23 Oct 2013 09:40:32 -0700

Based on SLOTS_MILLIS_MAPS, it looks like your map tasks are taking about three times as long in MR2 as they are in MR1. This is probably because you allow twice as many map tasks to run at a time in MR2 (12288/768 = 16). Being able to use all the containers isn't necessarily a good thing if you are oversubscribing your node's resources. Because of the different way that MR1 and MR2 view resources, I think it's better to test with mapred.reduce.slowstart.completed.maps=.99 so that the map and reduce phases run separately.

On the other side, it looks like your MR1 job has more spilled records than MR2. You should set io.sort.record.percent in MR1 to .13; this should improve MR1 performance and make the comparison fairer (MR2 does this tuning automatically for you).

-Sandy
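For reference, a minimal sketch of the two settings suggested above, in the usual *-site.xml property format (the property names and values come from the advice in this message; where each one belongs in your setup may differ):

    <!-- mapred-site.xml for the MR2 run: start reduces only once (nearly) all maps have finished -->
    <property>
      <name>mapred.reduce.slowstart.completed.maps</name>
      <value>0.99</value>
    </property>

    <!-- mapred-site.xml on the MR1 cluster only: match the sort-buffer tuning MR2 does automatically -->
    <property>
      <name>io.sort.record.percent</name>
      <value>0.13</value>
    </property>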
On Wed, Oct 23, 2013 at 9:22 AM, Jian Fang <jian.fang.subscribe@gmail.com> wrote:
> The number of map slots and reduce slots on each data node for MR1 are 8 and 3, respectively. Since MR2 could use all containers for either map or reduce, I would expect MR2 to be faster.
>
> On Wed, Oct 23, 2013 at 8:17 AM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:
>> How many map and reduce slots are you using per tasktracker in MR1? How do the average map times compare? (MR2 reports this directly on the web UI, but you can also get a sense in MR1 by scrolling through the map tasks page.) Can you share the counters for MR1?
>>
>> -Sandy
>>
>> On Wed, Oct 23, 2013 at 12:23 AM, Jian Fang <jian.fang.subscribe@gmail.com> wrote:
>>> Unfortunately, turning off JVM reuse still got the same result, i.e., about 90 minutes for MR2. I don't think the killed reduces could account for a 2x slowdown. There should be something very wrong either in configuration or code. Any hints?
>>>
>>> On Tue, Oct 22, 2013 at 5:50 PM, Jian Fang <jian.fang.subscribe@gmail.com> wrote:
>>>> Thanks Sandy. I will try to turn JVM reuse off and see what happens.
>>>>
>>>> Yes, I saw quite a few exceptions in the task attempts. For instance:
>>>>
>>>> 2013-10-20 03:13:58,751 ERROR [main] org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:SIMPLE) cause:java.nio.channels.ClosedChannelException
>>>> 2013-10-20 03:13:58,752 ERROR [Thread-6] org.apache.hadoop.hdfs.DFSClient: Failed to close file /1-tb-data/_temporary/1/_temporary/attempt_1382237301855_0001_m_000200_1/part-m-00200
>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /1-tb-data/_temporary/1/_temporary/attempt_1382237301855_0001_m_000200_1/part-m-00200: File does not exist. Holder DFSClient_attempt_1382237301855_0001_m_000200_1_872378586_1 does not have any open files.
>>>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2737)
>>>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:2801)
>>>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2783)
>>>>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:611)
>>>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:429)
>>>>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:48077)
>>>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:582)
>>>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
>>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>>>> --
>>>>         at com.sun.proxy.$Proxy10.complete(Unknown Source)
>>>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:371)
>>>>         at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:1910)
>>>>         at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1896)
>>>>         at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:773)
>>>>         at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:790)
>>>>         at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:847)
>>>>         at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2526)
>>>>         at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2551)
>>>>         at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
>>>> 2013-10-20 03:13:58,753 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.nio.channels.ClosedChannelException
>>>>         at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1325)
>>>>         at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:98)
>>>>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:61)
>>>>         at java.io.DataOutputStream.write(DataOutputStream.java:107)
>>>>         at org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.write(TeraOutputFormat.java:69)
>>>>         at org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.write(TeraOutputFormat.java:57)
>>>>         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:646)
>>>>         at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>>>>         at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>>>>         at org.apache.hadoop.examples.terasort.TeraGen$SortGenMapper.map(TeraGen.java:230)
>>>>
>>>> On Tue, Oct 22, 2013 at 4:45 PM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:
>>>>> It looks like many of your reduce tasks were killed. Do you know why? Also, MR2 doesn't have JVM reuse, so it might make sense to compare it to MR1 with JVM reuse turned off.
>>>>>
>>>>> -Sandy
>>>>>
>>>>> On Tue, Oct 22, 2013 at 3:06 PM, Jian Fang <jian.fang.subscribe@gmail.com> wrote:
>>>>>> The Terasort output for MR2 is as follows.
>>>>>> 2013-10-22 21:40:16,261 INFO org.apache.hadoop.mapreduce.Job (main): Counters: 46
>>>>>>         File System Counters
>>>>>>                 FILE: Number of bytes read=456102049355
>>>>>>                 FILE: Number of bytes written=897246250517
>>>>>>                 FILE: Number of read operations=0
>>>>>>                 FILE: Number of large read operations=0
>>>>>>                 FILE: Number of write operations=0
>>>>>>                 HDFS: Number of bytes read=1000000851200
>>>>>>                 HDFS: Number of bytes written=1000000000000
>>>>>>                 HDFS: Number of read operations=32131
>>>>>>                 HDFS: Number of large read operations=0
>>>>>>                 HDFS: Number of write operations=224
>>>>>>         Job Counters
>>>>>>                 Killed map tasks=1
>>>>>>                 Killed reduce tasks=20
>>>>>>                 Launched map tasks=7601
>>>>>>                 Launched reduce tasks=132
>>>>>>                 Data-local map tasks=7591
>>>>>>                 Rack-local map tasks=10
>>>>>>                 Total time spent by all maps in occupied slots (ms)=1696141311
>>>>>>                 Total time spent by all reduces in occupied slots (ms)=2664045096
>>>>>>         Map-Reduce Framework
>>>>>>                 Map input records=10000000000
>>>>>>                 Map output records=10000000000
>>>>>>                 Map output bytes=1020000000000
>>>>>>                 Map output materialized bytes=440486356802
>>>>>>                 Input split bytes=851200
>>>>>>                 Combine input records=0
>>>>>>                 Combine output records=0
>>>>>>                 Reduce input groups=10000000000
>>>>>>                 Reduce shuffle bytes=440486356802
>>>>>>                 Reduce input records=10000000000
>>>>>>                 Reduce output records=10000000000
>>>>>>                 Spilled Records=20000000000
>>>>>>                 Shuffled Maps =851200
>>>>>>                 Failed Shuffles=61
>>>>>>                 Merged Map outputs=851200
>>>>>>                 GC time elapsed (ms)=4215666
>>>>>>                 CPU time spent (ms)=192433000
>>>>>>                 Physical memory (bytes) snapshot=3349356380160
>>>>>>                 Virtual memory (bytes) snapshot=9665208745984
>>>>>>                 Total committed heap usage (bytes)=3636854259712
>>>>>>         Shuffle Errors
>>>>>>                 BAD_ID=0
>>>>>>                 CONNECTION=0
>>>>>>                 IO_ERROR=4
>>>>>>                 WRONG_LENGTH=0
>>>>>>                 WRONG_MAP=0
>>>>>>                 WRONG_REDUCE=0
>>>>>>         File Input Format Counters
>>>>>>                 Bytes Read=1000000000000
>>>>>>         File Output Format Counters
>>>>>>                 Bytes Written=1000000000000
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> John
>>>>>>
>>>>>> On Tue, Oct 22, 2013 at 2:44 PM, Jian Fang <jian.fang.subscribe@gmail.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have the same problem. I compared Hadoop 2.2.0 with Hadoop 1.0.3, and it turned out that terasort on MR2 is 2 times slower than on MR1. I can hardly believe it.
>>>>>>>
>>>>>>> The cluster has 20 nodes with 19 data nodes. My Hadoop 2.2.0 cluster configurations are as follows.
>>>>>>> mapreduce.map.java.opts = "-Xmx512m";
>>>>>>> mapreduce.reduce.java.opts = "-Xmx1536m";
>>>>>>> mapreduce.map.memory.mb = "768";
>>>>>>> mapreduce.reduce.memory.mb = "2048";
>>>>>>>
>>>>>>> yarn.scheduler.minimum-allocation-mb = "256";
>>>>>>> yarn.scheduler.maximum-allocation-mb = "8192";
>>>>>>> yarn.nodemanager.resource.memory-mb = "12288";
>>>>>>> yarn.nodemanager.resource.cpu-vcores = "16";
>>>>>>>
>>>>>>> mapreduce.reduce.shuffle.parallelcopies = "20";
>>>>>>> mapreduce.task.io.sort.factor = "48";
>>>>>>> mapreduce.task.io.sort.mb = "200";
>>>>>>> mapreduce.map.speculative = "true";
>>>>>>> mapreduce.reduce.speculative = "true";
>>>>>>> mapreduce.framework.name = "yarn";
>>>>>>> yarn.app.mapreduce.am.job.task.listener.thread-count = "60";
>>>>>>> mapreduce.map.cpu.vcores = "1";
>>>>>>> mapreduce.reduce.cpu.vcores = "2";
>>>>>>>
>>>>>>> mapreduce.job.jvm.numtasks = "20";
>>>>>>> mapreduce.map.output.compress = "true";
>>>>>>> mapreduce.map.output.compress.codec = "org.apache.hadoop.io.compress.SnappyCodec";
>>>>>>>
>>>>>>> yarn.resourcemanager.client.thread-count = "64";
>>>>>>> yarn.resourcemanager.scheduler.client.thread-count = "64";
>>>>>>> yarn.resourcemanager.resource-tracker.client.thread-count = "64";
>>>>>>> yarn.resourcemanager.scheduler.class = "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler";
>>>>>>> yarn.nodemanager.aux-services = "mapreduce_shuffle";
>>>>>>> yarn.nodemanager.aux-services.mapreduce.shuffle.class = "org.apache.hadoop.mapred.ShuffleHandler";
>>>>>>> yarn.nodemanager.vmem-pmem-ratio = "5";
>>>>>>> yarn.nodemanager.container-executor.class = "org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor";
>>>>>>> yarn.nodemanager.container-manager.thread-count = "64";
>>>>>>> yarn.nodemanager.localizer.client.thread-count = "20";
>>>>>>> yarn.nodemanager.localizer.fetch.thread-count = "20";
>>>>>>>
>>>>>>> My Hadoop 1.0.3 cluster has the same memory/disks/cores and almost the same other configurations. In MR1, the 1TB terasort took about 45 minutes, but it took around 90 minutes in MR2.
>>>>>>>
>>>>>>> Does anyone know what is wrong here? Or do I need some special configuration for terasort to work better in MR2?
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>>
>>>>>>> John
>>>>>>>
>>>>>>> On Tue, Jun 18, 2013 at 3:11 AM, Michel Segel <michael_segel@hotmail.com> wrote:
>>>>>>>> Sam,
>>>>>>>> I think your cluster is too small for any meaningful conclusions to be made.
>>>>>>>>
>>>>>>>> Sent from a remote device. Please excuse any typos...
>>>>>>>>
>>>>>>>> Mike Segel
>>>>>>>>
>>>>>>>> On Jun 18, 2013, at 3:58 AM, sam liu <samliuhadoop@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi Harsh,
>>>>>>>>
>>>>>>>> Thanks for your detailed response! Now, the efficiency of my Yarn cluster improved a lot after increasing the reducer number (mapreduce.job.reduces) in mapred-site.xml. But I still have some questions about the way Yarn executes an MRv1 job:
>>>>>>>>
>>>>>>>> 1. In Hadoop 1.x, a job is executed by map tasks and reduce tasks together, with a typical process (map > shuffle > reduce). In Yarn, as I understand it, an MRv1 job is executed only by the ApplicationMaster.
>>>>>>>> - Yarn can run multiple kinds of jobs (MR, MPI, ...), but an MRv1 job has a special execution process (map > shuffle > reduce) in Hadoop 1.x, so how does Yarn execute an MRv1 job?
>>>>>>>> Does it still include the MR-specific steps from Hadoop 1.x, like map, sort, merge, combine and shuffle?
>>>>>>>> - Do the MRv1 parameters still work for Yarn, like mapreduce.task.io.sort.mb and mapreduce.map.sort.spill.percent?
>>>>>>>> - What's the general process for the ApplicationMaster in Yarn to execute a job?
>>>>>>>>
>>>>>>>> 2. In Hadoop 1.x, we can set the map/reduce slots by setting 'mapred.tasktracker.map.tasks.maximum' and 'mapred.tasktracker.reduce.tasks.maximum'.
>>>>>>>> - For Yarn, the above two parameters no longer work, as Yarn uses containers instead, right?
>>>>>>>> - For Yarn, we can set the total physical memory of a NodeManager using 'yarn.nodemanager.resource.memory-mb'. But how do we set the default physical memory size of a container?
>>>>>>>> - How do we set the maximum physical memory size of a container? With the parameter 'mapred.child.java.opts'?
>>>>>>>>
>>>>>>>> Thanks as always!
>>>>>>>>
>>>>>>>> 2013/6/9 Harsh J <harsh@cloudera.com>
>>>>>>>>> Hi Sam,
>>>>>>>>>
>>>>>>>>> > - How to know the container number? Why you say it will be 22 containers due to a 22 GB memory?
>>>>>>>>>
>>>>>>>>> MR2's default configuration requests 1 GB of resource each for Map and Reduce containers. It additionally requests 1.5 GB for the AM container that runs the job. This is tunable using the properties Sandy mentioned in his post.
>>>>>>>>>
>>>>>>>>> > - My machine has 32 GB memory, how much memory is proper to be assigned to containers?
>>>>>>>>>
>>>>>>>>> This is a general question. You may use the same process you took to decide the optimal number of slots in MR1 to decide this here. Every container is a new JVM, and you're limited by the CPUs you have there (if not the memory). Either increase memory requests from jobs, to lower the number of concurrent containers at a given time (a runtime change), or lower the NM's published memory resources to control the same (a config change).
>>>>>>>>>
>>>>>>>>> > - In mapred-site.xml, if I set 'mapreduce.framework.name' to 'yarn', will other parameters for mapred-site.xml still work in the yarn framework? Like 'mapreduce.task.io.sort.mb' and 'mapreduce.map.sort.spill.percent'
>>>>>>>>>
>>>>>>>>> Yes, all of these properties will still work. Old properties specific to the JobTracker or TaskTracker (usually found as a keyword in the config name) will not apply anymore.
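A minimal sketch of the two knobs described just above, using property names that already appear in this thread; the values shown are only placeholders to illustrate the direction of each change, not recommendations from the list:

    <!-- mapred-site.xml: raise the per-task request so fewer containers run at once -->
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>1536</value>  <!-- placeholder; the default mentioned in this thread is 1024 -->
    </property>

    <!-- yarn-site.xml: or lower what each NodeManager advertises to the scheduler -->
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>12288</value>  <!-- placeholder -->
    </property>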
>>>>>>>>> On Sun, Jun 9, 2013 at 2:21 PM, sam liu <samliuhadoop@gmail.com> wrote:
>>>>>>>>> > Hi Harsh,
>>>>>>>>> >
>>>>>>>>> > According to the above suggestions, I removed the duplicated setting and reduced the values of 'yarn.nodemanager.resource.cpu-cores', 'yarn.nodemanager.vcores-pcores-ratio' and 'yarn.nodemanager.resource.memory-mb' to 16, 8 and 12000. And then the efficiency improved by about 18%. I have questions:
>>>>>>>>> >
>>>>>>>>> > - How to know the container number? Why you say it will be 22 containers due to a 22 GB memory?
>>>>>>>>> > - My machine has 32 GB memory, how much memory is proper to be assigned to containers?
>>>>>>>>> > - In mapred-site.xml, if I set 'mapreduce.framework.name' to 'yarn', will other parameters for mapred-site.xml still work in the yarn framework? Like 'mapreduce.task.io.sort.mb' and 'mapreduce.map.sort.spill.percent'
>>>>>>>>> >
>>>>>>>>> > Thanks!
>>>>>>>>> >
>>>>>>>>> > 2013/6/8 Harsh J <harsh@cloudera.com>
>>>>>>>>> >>
>>>>>>>>> >> Hey Sam,
>>>>>>>>> >>
>>>>>>>>> >> Did you get a chance to retry with Sandy's suggestions? The config appears to be asking NMs to use roughly 22 total containers (as opposed to 12 total tasks in the MR1 config) due to a 22 GB memory resource. This could impact much, given the CPU is still the same for both test runs.
>>>>>>>>> >>
>>>>>>>>> >> On Fri, Jun 7, 2013 at 12:23 PM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:
>>>>>>>>> >> > Hey Sam,
>>>>>>>>> >> >
>>>>>>>>> >> > Thanks for sharing your results. I'm definitely curious about what's causing the difference.
>>>>>>>>> >> >
>>>>>>>>> >> > A couple of observations:
>>>>>>>>> >> > It looks like you've got yarn.nodemanager.resource.memory-mb in there twice with two different values.
>>>>>>>>> >> >
>>>>>>>>> >> > Your max JVM memory of 1000 MB is (dangerously?) close to the default mapreduce.map/reduce.memory.mb of 1024 MB. Are any of your tasks getting killed for running over resource limits?
>>>>>>>>> >> >
>>>>>>>>> >> > -Sandy
>>>>>>>>> >> >
>>>>>>>>> >> > On Thu, Jun 6, 2013 at 10:21 PM, sam liu <samliuhadoop@gmail.com> wrote:
>>>>>>>>> >> >>
>>>>>>>>> >> >> The terasort execution log shows that the reduce spent about 5.5 minutes going from 33% to 35%, as below:
>>>>>>>>> >> >> 13/06/10 08:02:22 INFO mapreduce.Job:  map 100% reduce 31%
>>>>>>>>> >> >> 13/06/10 08:02:25 INFO mapreduce.Job:  map 100% reduce 32%
>>>>>>>>> >> >> 13/06/10 08:02:46 INFO mapreduce.Job:  map 100% reduce 33%
>>>>>>>>> >> >> 13/06/10 08:08:16 INFO mapreduce.Job:  map 100% reduce 35%
>>>>>>>>> >> >> 13/06/10 08:08:19 INFO mapreduce.Job:  map 100% reduce 40%
>>>>>>>>> >> >> 13/06/10 08:08:22 INFO mapreduce.Job:  map 100% reduce 43%
>>>>>>>>> >> >>
>>>>>>>>> >> >> Anyway, below are my configurations for your reference. Thanks!
>>>>>>>>> >> >> (A) core-site.xml
>>>>>>>>> >> >> only defines 'fs.default.name' and 'hadoop.tmp.dir'
>>>>>>>>> >> >>
>>>>>>>>> >> >> (B) hdfs-site.xml
>>>>>>>>> >> >> dfs.replication = 1
>>>>>>>>> >> >> dfs.name.dir = /opt/hadoop-2.0.4-alpha/temp/hadoop/dfs_name_dir
>>>>>>>>> >> >> dfs.data.dir = /opt/hadoop-2.0.4-alpha/temp/hadoop/dfs_data_dir
>>>>>>>>> >> >> dfs.block.size = 134217728
>>>>>>>>> >> >> dfs.namenode.handler.count = 64
>>>>>>>>> >> >> dfs.datanode.handler.count = 10
>>>>>>>>> >> >>
>>>>>>>>> >> >> (C) mapred-site.xml
>>>>>>>>> >> >> mapreduce.cluster.temp.dir = /opt/hadoop-2.0.4-alpha/temp/hadoop/mapreduce_temp (final)
>>>>>>>>> >> >> mapreduce.cluster.local.dir = /opt/hadoop-2.0.4-alpha/temp/hadoop/mapreduce_local_dir (final)
>>>>>>>>> >> >> mapreduce.child.java.opts = -Xmx1000m
>>>>>>>>> >> >> mapreduce.framework.name = yarn
>>>>>>>>> >> >> mapreduce.tasktracker.map.tasks.maximum = 8
>>>>>>>>> >> >> mapreduce.tasktracker.reduce.tasks.maximum = 4
>>>>>>>>> >> >> mapreduce.tasktracker.outofband.heartbeat = true
>>>>>>>>> >> >>
>>>>>>>>> >> >> (D) yarn-site.xml
>>>>>>>>> >> >> yarn.resourcemanager.resource-tracker.address = node1:18025 (host is the hostname of the resource manager and port is the port on which the NodeManagers contact the Resource Manager)
>>>>>>>>> >> >> yarn.resourcemanager.webapp.address = node1:18088 (the address of the RM web application)
>>>>>>>>> >> >> yarn.resourcemanager.scheduler.address = node1:18030 (host is the hostname of the resourcemanager and port is the port on which the Applications in the cluster talk to the Resource Manager)
>>>>>>>>> >> >> yarn.resourcemanager.address = node1:18040 (the host is the hostname of the ResourceManager and the port is the port on which the clients can talk to the Resource Manager)
>>>>>>>>> >> >> yarn.nodemanager.local-dirs = /opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_local_dir (the local directories used by the nodemanager)
>>>>>>>>> >> >> yarn.nodemanager.address = 0.0.0.0:18050 (the nodemanagers bind to this port)
>>>>>>>>> >> >> yarn.nodemanager.resource.memory-mb = 10240 (the amount of memory on the NodeManager in GB)
>>>>>>>>> >> >> yarn.nodemanager.remote-app-log-dir = /opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_app-logs (directory on hdfs where the application logs are moved to)
>>>>>>>>> >> >> yarn.nodemanager.log-dirs = /opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_log (the directories used by Nodemanagers as log directories)
>>>>>>>>> >> >> yarn.nodemanager.aux-services = mapreduce.shuffle (shuffle service that needs to be set for Map Reduce to run)
>>>>>>>>> >> >> yarn.resourcemanager.client.thread-count = 64
>>>>>>>>> >> >> yarn.nodemanager.resource.cpu-cores = 24
>>>>>>>>> >> >> yarn.nodemanager.vcores-pcores-ratio = 3
>>>>>>>>> >> >> yarn.nodemanager.resource.memory-mb = 22000
>>>>>>>>> >> >> yarn.nodemanager.vmem-pmem-ratio = 2.1
>>>>>>>>> >> >>
>>>>>>>>> >> >> 2013/6/7 Harsh J <harsh@cloudera.com>
>>>>>>>>> >> >>>
>>>>>>>>> >> >>> Not tuning configurations at all is wrong. YARN uses memory-resource-based scheduling and hence MR2 would be requesting 1 GB minimum by default, causing it, on base configs, to max out at 8 total containers (due to the 8 GB NM memory resource config). Do share your configs, as at this point none of us can tell what they are.
>>>>>>>>> >> >>>
>>>>>>>>> >> >>> Obviously, it isn't our goal to make MR2 slower for users and to not care about such things :)
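For reference, a sketch of the base-config values being described here, written out as yarn-site.xml properties (these are the stock values as I understand them; worth verifying against your Hadoop version):

    <!-- each NodeManager advertises 8 GB by default -->
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>8192</value>
    </property>

    <!-- each container request is rounded up to at least 1 GB by default -->
    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>1024</value>
    </property>

    <!-- so on base configs: 8192 / 1024 = at most 8 concurrent containers per node -->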
>>>>>>>>> >> >>> On Fri, Jun 7, 2013 at 8:45 AM, sam liu <samliuhadoop@gmail.com> wrote:
>>>>>>>>> >> >>> > At the beginning, I just wanted to do a quick comparison of MRv1 and Yarn. But they have many differences, and to be fair for the comparison I did not tune their configurations at all. So I got the above test results. After analyzing the test results, no doubt, I will configure them and do the comparison again.
>>>>>>>>> >> >>> >
>>>>>>>>> >> >>> > Do you have any idea on the current test results? I think, compared with MRv1, Yarn is better in the Map phase (teragen test) but worse in the Reduce phase (terasort test). And do you have any detailed suggestions/comments/materials on Yarn performance tuning?
>>>>>>>>> >> >>> >
>>>>>>>>> >> >>> > Thanks!
>>>>>>>>> >> >>> >
>>>>>>>>> >> >>> > 2013/6/7 Marcos Luis Ortiz Valmaseda <marcosluis2186@gmail.com>
>>>>>>>>> >> >>> >>
>>>>>>>>> >> >>> >> Why not tune the configurations?
>>>>>>>>> >> >>> >> Both frameworks have many areas to tune:
>>>>>>>>> >> >>> >> - Combiners, shuffle optimization, block size, etc.
>>>>>>>>> >> >>> >>
>>>>>>>>> >> >>> >> 2013/6/6 sam liu <samliuhadoop@gmail.com>
>>>>>>>>> >> >>> >>>
>>>>>>>>> >> >>> >>> Hi Experts,
>>>>>>>>> >> >>> >>>
>>>>>>>>> >> >>> >>> We are thinking about whether to use Yarn or not in the near future, and I ran teragen/terasort on Yarn and MRv1 for comparison.
>>>>>>>>> >> >>> >>>
>>>>>>>>> >> >>> >>> My env is a three-node cluster, and each node has similar hardware: 2 CPUs (4 cores), 32 GB memory. Both the Yarn and MRv1 clusters are set up on the same env. To be fair, I did not do any performance tuning on their configurations, but used the default configuration values.
>>>>>>>>> >> >>> >>>
>>>>>>>>> >> >>> >>> Before testing, I thought Yarn would be much better than MRv1 if they both use the default configuration, because Yarn is a better framework than MRv1. However, the test results show some differences:
>>>>>>>>> >> >>> >>>
>>>>>>>>> >> >>> >>> MRv1: Hadoop-1.1.1
>>>>>>>>> >> >>> >>> Yarn: Hadoop-2.0.4
>>>>>>>>> >> >>> >>>
>>>>>>>>> >> >>> >>> (A) Teragen: generate 10 GB data:
>>>>>>>>> >> >>> >>> - MRv1: 193 sec
>>>>>>>>> >> >>> >>> - Yarn: 69 sec
>>>>>>>>> >> >>> >>> Yarn is 2.8 times better than MRv1
>>>>>>>>> >> >>> >>>
>>>>>>>>> >> >>> >>> (B) Terasort: sort 10 GB data:
>>>>>>>>> >> >>> >>> - MRv1: 451 sec
>>>>>>>>> >> >>> >>> - Yarn: 1136 sec
>>>>>>>>> >> >>> >>> Yarn is 2.5 times worse than MRv1
>>>>>>>>> >> >>> >>>
>>>>>>>>> >> >>> >>> After a quick analysis, I think the direct cause might be that Yarn is much faster than MRv1 in the Map phase but much worse in the Reduce phase.
>>>>>>>>> >> >>> >>>
>>>>>>>>> >> >>> >>> Here I have two questions:
>>>>>>>>> >> >>> >>> - Why do my tests show Yarn is worse than MRv1 for terasort?
>>>>>>>>> >> >>> >>> - What's the strategy for tuning Yarn performance? Are there any materials?
>>>>>>>>> >> >>> >>>
>>>>>>>>> >> >>> >>> Thanks!
>>>>>>>>> >> >>> >>
>>>>>>>>> >> >>> >> --
>>>>>>>>> >> >>> >> Marcos Ortiz Valmaseda
>>>>>>>>> >> >>> >> Product Manager at PDVSA
>>>>>>>>> >> >>> >> http://about.me/marcosortiz
>>>>>>>>> >> >>>
>>>>>>>>> >> >>> --
>>>>>>>>> >> >>> Harsh J
>>>>>>>>> >>
>>>>>>>>> >> --
>>>>>>>>> >> Harsh J
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Harsh J