hadoop-hdfs-user mailing list archives

From Jian Fang <jian.fang.subscr...@gmail.com>
Subject Re: Why my tests shows Yarn is worse than MRv1 for terasort?
Date Wed, 23 Oct 2013 17:05:31 GMT
Thanks Sandy.

io.sort.record.percent is at its default value of 0.05 for both MR1 and MR2.
mapreduce.job.reduce.slowstart.completedmaps in MR2 and
mapred.reduce.slowstart.completed.maps in MR1 both use the default value of
0.05.

I tried allocating 1536MB and 1024MB to the map container some time ago, but
neither change gave me a better result, so I changed it back to 768MB.

I will try mapred.reduce.slowstart.completed.maps=.99 to see what happens.
BTW, I should use mapreduce.job.reduce.slowstart.completedmaps in MR2, right?

Also, in MR1 I can specify tasktracker.http.threads, but I could not find its
counterpart for MR2. Which property should I tune for the HTTP threads?
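For the record, this is roughly what I intend to add to mapred-site.xml for the
next run (just a sketch, assuming that is indeed the right MR2 property name):

    <property>
      <name>mapreduce.job.reduce.slowstart.completedmaps</name>
      <value>0.99</value>
    </property>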

Thanks again.


On Wed, Oct 23, 2013 at 9:40 AM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:

> Based on SLOTS_MILLIS_MAPS, it looks like your map tasks are taking about
> three times as long in MR2 as they are in MR1.  This is probably because
> you allow twice as many map tasks to run at a time in MR2 (12288/768 =
> 16).  Being able to use all the containers isn't necessarily a good thing
> if you are oversubscribing your node's resources.  Because of the different
> way that MR1 and MR2 view resources, I think it's better to test with
> mapred.reduce.slowstart.completed.maps=.99 so that the map and reduce
> phases will run separately.
>
> On the other hand, it looks like your MR1 run has more spilled records than
> MR2.  For a fairer comparison, you should set io.sort.record.percent in MR1
> to .13, which should improve MR1 performance (MR2 automatically does this
> tuning for you).
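>
> Concretely, something like this in the MR1 mapred-site.xml (or passed per
> job with -D) is what I have in mind; just a sketch using the values above:
>
>   <property>
>     <name>mapred.reduce.slowstart.completed.maps</name>
>     <value>0.99</value>
>   </property>
>   <property>
>     <name>io.sort.record.percent</name>
>     <value>0.13</value>
>   </property>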
>
> -Sandy
>
>
> On Wed, Oct 23, 2013 at 9:22 AM, Jian Fang <jian.fang.subscribe@gmail.com>wrote:
>
>> The map and reduce slot counts on each data node for MR1 are 8 and 3,
>> respectively. Since MR2 can use all of its containers for either maps or
>> reduces, I would expect MR2 to be faster.
>>
>>
>> On Wed, Oct 23, 2013 at 8:17 AM, Sandy Ryza <sandy.ryza@cloudera.com>wrote:
>>
>>> How many map and reduce slots are you using per tasktracker in MR1?  How
>>> do the average map times compare? (MR2 reports this directly on the web UI,
>>> but you can also get a sense in MR1 by scrolling through the map tasks
>>> page).  Can you share the counters for MR1?
>>>
>>> -Sandy
>>>
>>>
>>> On Wed, Oct 23, 2013 at 12:23 AM, Jian Fang <
>>> jian.fang.subscribe@gmail.com> wrote:
>>>
>>>> Unfortunately, turning off JVM reuse still gave the same result, i.e.,
>>>> about 90 minutes for MR2. I don't think the killed reduces could account
>>>> for a 2x slowdown. There must be something very wrong in either the
>>>> configuration or the code. Any hints?
>>>>
>>>>
>>>>
>>>> On Tue, Oct 22, 2013 at 5:50 PM, Jian Fang <
>>>> jian.fang.subscribe@gmail.com> wrote:
>>>>
>>>>> Thanks Sandy. I will try turning JVM reuse off and see what happens.
>>>>>
>>>>> Yes, I saw quite a few exceptions in the task attempts. For instance:
>>>>>
>>>>>
>>>>> 2013-10-20 03:13:58,751 ERROR [main]
>>>>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>>>>> as:hadoop (auth:SIMPLE) cause:java.nio.channels.ClosedChannelException
>>>>> 2013-10-20 03:13:58,752 ERROR [Thread-6]
>>>>> org.apache.hadoop.hdfs.DFSClient: Failed to close file
>>>>> /1-tb-data/_temporary/1/_temporary/attempt_1382237301855_0001_m_000200_1/part-m-00200
>>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
>>>>> No lease on
>>>>> /1-tb-data/_temporary/1/_temporary/attempt_1382237301855_0001_m_000200_1/part-m-00200:
>>>>> File does not exist. Holder
>>>>> DFSClient_attempt_1382237301855_0001_m_000200_1_872378586_1 does not have
>>>>> any open files.
>>>>>         at
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2737)
>>>>>         at
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:2801)
>>>>>         at
>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2783)
>>>>>         at
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:611)
>>>>>         at
>>>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:429)
>>>>>         at
>>>>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:48077)
>>>>>         at
>>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:582)
>>>>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>>>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
>>>>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>>>>> --
>>>>>         at com.sun.proxy.$Proxy10.complete(Unknown Source)
>>>>>         at
>>>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:371)
>>>>>         at
>>>>> org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:1910)
>>>>>         at
>>>>> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:1896)
>>>>>         at
>>>>> org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:773)
>>>>>         at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:790)
>>>>>         at
>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:847)
>>>>>         at
>>>>> org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2526)
>>>>>         at
>>>>> org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2551)
>>>>>         at
>>>>> org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
>>>>> 2013-10-20 03:13:58,753 WARN [main]
>>>>> org.apache.hadoop.mapred.YarnChild: Exception running child :
>>>>> java.nio.channels.ClosedChannelException
>>>>>         at
>>>>> org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1325)
>>>>>         at
>>>>> org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:98)
>>>>>         at
>>>>> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:61)
>>>>>         at java.io.DataOutputStream.write(DataOutputStream.java:107)
>>>>>         at
>>>>> org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.write(TeraOutputFormat.java:69)
>>>>>         at
>>>>> org.apache.hadoop.examples.terasort.TeraOutputFormat$TeraRecordWriter.write(TeraOutputFormat.java:57)
>>>>>         at
>>>>> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:646)
>>>>>         at
>>>>> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>>>>>         at
>>>>> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>>>>>         at
>>>>> org.apache.hadoop.examples.terasort.TeraGen$SortGenMapper.map(TeraGen.java:230)
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Oct 22, 2013 at 4:45 PM, Sandy Ryza <sandy.ryza@cloudera.com>wrote:
>>>>>
>>>>>> It looks like many of your reduce tasks were killed.  Do you know
>>>>>> why?  Also, MR2 doesn't have JVM reuse, so it might make sense to compare
>>>>>> it to MR1 with JVM reuse turned off.
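>>>>>>
>>>>>> (In MR1 terms that would be something like the following in
>>>>>> mapred-site.xml; I'm writing the property name from memory, so please
>>>>>> double-check it. A value of 1 means each JVM runs a single task, i.e.
>>>>>> no reuse.)
>>>>>>
>>>>>>   <property>
>>>>>>     <name>mapred.job.reuse.jvm.num.tasks</name>
>>>>>>     <value>1</value>
>>>>>>   </property>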
>>>>>>
>>>>>> -Sandy
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 22, 2013 at 3:06 PM, Jian Fang <
>>>>>> jian.fang.subscribe@gmail.com> wrote:
>>>>>>
>>>>>>> The Terasort output for MR2 is as follows.
>>>>>>>
>>>>>>> 2013-10-22 21:40:16,261 INFO org.apache.hadoop.mapreduce.Job (main):
>>>>>>> Counters: 46
>>>>>>>         File System Counters
>>>>>>>                 FILE: Number of bytes read=456102049355
>>>>>>>                 FILE: Number of bytes written=897246250517
>>>>>>>                 FILE: Number of read operations=0
>>>>>>>                 FILE: Number of large read operations=0
>>>>>>>                 FILE: Number of write operations=0
>>>>>>>                 HDFS: Number of bytes read=1000000851200
>>>>>>>                 HDFS: Number of bytes written=1000000000000
>>>>>>>                 HDFS: Number of read operations=32131
>>>>>>>                 HDFS: Number of large read operations=0
>>>>>>>                 HDFS: Number of write operations=224
>>>>>>>         Job Counters
>>>>>>>                 Killed map tasks=1
>>>>>>>                 Killed reduce tasks=20
>>>>>>>                 Launched map tasks=7601
>>>>>>>                 Launched reduce tasks=132
>>>>>>>                 Data-local map tasks=7591
>>>>>>>                 Rack-local map tasks=10
>>>>>>>                 Total time spent by all maps in occupied slots
>>>>>>> (ms)=1696141311
>>>>>>>                 Total time spent by all reduces in occupied slots
>>>>>>> (ms)=2664045096
>>>>>>>         Map-Reduce Framework
>>>>>>>                 Map input records=10000000000
>>>>>>>                 Map output records=10000000000
>>>>>>>                 Map output bytes=1020000000000
>>>>>>>                 Map output materialized bytes=440486356802
>>>>>>>                 Input split bytes=851200
>>>>>>>                 Combine input records=0
>>>>>>>                 Combine output records=0
>>>>>>>                 Reduce input groups=10000000000
>>>>>>>                 Reduce shuffle bytes=440486356802
>>>>>>>                 Reduce input records=10000000000
>>>>>>>                 Reduce output records=10000000000
>>>>>>>                 Spilled Records=20000000000
>>>>>>>                 Shuffled Maps =851200
>>>>>>>                 Failed Shuffles=61
>>>>>>>                 Merged Map outputs=851200
>>>>>>>                 GC time elapsed (ms)=4215666
>>>>>>>                 CPU time spent (ms)=192433000
>>>>>>>                 Physical memory (bytes) snapshot=3349356380160
>>>>>>>                 Virtual memory (bytes) snapshot=9665208745984
>>>>>>>                 Total committed heap usage (bytes)=3636854259712
>>>>>>>         Shuffle Errors
>>>>>>>                 BAD_ID=0
>>>>>>>                 CONNECTION=0
>>>>>>>                 IO_ERROR=4
>>>>>>>                 WRONG_LENGTH=0
>>>>>>>                 WRONG_MAP=0
>>>>>>>                 WRONG_REDUCE=0
>>>>>>>         File Input Format Counters
>>>>>>>                 Bytes Read=1000000000000
>>>>>>>         File Output Format Counters
>>>>>>>                 Bytes Written=1000000000000
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> John
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Oct 22, 2013 at 2:44 PM, Jian Fang <
>>>>>>> jian.fang.subscribe@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have the same problem. I compared Hadoop 2.2.0 with Hadoop 1.0.3,
>>>>>>>> and it turned out that terasort on MR2 is 2 times slower than on MR1.
>>>>>>>> I can hardly believe it.
>>>>>>>>
>>>>>>>> The cluster has 20 nodes with 19 data nodes.  My Hadoop 2.2.0
>>>>>>>> cluster configurations are as follows.
>>>>>>>>
>>>>>>>>         mapreduce.map.java.opts = "-Xmx512m";
>>>>>>>>         mapreduce.reduce.java.opts = "-Xmx1536m";
>>>>>>>>         mapreduce.map.memory.mb = "768";
>>>>>>>>         mapreduce.reduce.memory.mb = "2048";
>>>>>>>>
>>>>>>>>         yarn.scheduler.minimum-allocation-mb = "256";
>>>>>>>>         yarn.scheduler.maximum-allocation-mb = "8192";
>>>>>>>>         yarn.nodemanager.resource.memory-mb = "12288";
>>>>>>>>         yarn.nodemanager.resource.cpu-vcores = "16";
>>>>>>>>
>>>>>>>>         mapreduce.reduce.shuffle.parallelcopies = "20";
>>>>>>>>         mapreduce.task.io.sort.factor = "48";
>>>>>>>>         mapreduce.task.io.sort.mb = "200";
>>>>>>>>         mapreduce.map.speculative = "true";
>>>>>>>>         mapreduce.reduce.speculative = "true";
>>>>>>>>         mapreduce.framework.name = "yarn";
>>>>>>>>         yarn.app.mapreduce.am.job.task.listener.thread-count = "60";
>>>>>>>>         mapreduce.map.cpu.vcores = "1";
>>>>>>>>         mapreduce.reduce.cpu.vcores = "2";
>>>>>>>>
>>>>>>>>         mapreduce.job.jvm.numtasks = "20";
>>>>>>>>         mapreduce.map.output.compress = "true";
>>>>>>>>         mapreduce.map.output.compress.codec =
>>>>>>>> "org.apache.hadoop.io.compress.SnappyCodec";
>>>>>>>>
>>>>>>>>         yarn.resourcemanager.client.thread-count = "64";
>>>>>>>>         yarn.resourcemanager.scheduler.client.thread-count = "64";
>>>>>>>>         yarn.resourcemanager.resource-tracker.client.thread-count =
>>>>>>>> "64";
>>>>>>>>         yarn.resourcemanager.scheduler.class =
>>>>>>>> "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler";
>>>>>>>>         yarn.nodemanager.aux-services = "mapreduce_shuffle";
>>>>>>>>         yarn.nodemanager.aux-services.mapreduce.shuffle.class =
>>>>>>>> "org.apache.hadoop.mapred.ShuffleHandler";
>>>>>>>>         yarn.nodemanager.vmem-pmem-ratio = "5";
>>>>>>>>         yarn.nodemanager.container-executor.class =
>>>>>>>> "org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor";
>>>>>>>>         yarn.nodemanager.container-manager.thread-count = "64";
>>>>>>>>         yarn.nodemanager.localizer.client.thread-count = "20";
>>>>>>>>         yarn.nodemanager.localizer.fetch.thread-count = "20";
>>>>>>>>
>>>>>>>> My Hadoop 1.0.3 cluster has the same memory/disks/cores and almost the
>>>>>>>> same configuration otherwise. In MR1, the 1TB terasort took about 45
>>>>>>>> minutes, but it took around 90 minutes in MR2.
>>>>>>>>
>>>>>>>> Does anyone know what is wrong here? Or do I need some special
>>>>>>>> configurations for terasort to work better in MR2?
>>>>>>>>
>>>>>>>> Thanks in advance,
>>>>>>>>
>>>>>>>> John
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jun 18, 2013 at 3:11 AM, Michel Segel <
>>>>>>>> michael_segel@hotmail.com> wrote:
>>>>>>>>
>>>>>>>>> Sam,
>>>>>>>>> I think your cluster is too small for any meaningful conclusions
>>>>>>>>> to be made.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sent from a remote device. Please excuse any typos...
>>>>>>>>>
>>>>>>>>> Mike Segel
>>>>>>>>>
>>>>>>>>> On Jun 18, 2013, at 3:58 AM, sam liu <samliuhadoop@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi Harsh,
>>>>>>>>>
>>>>>>>>> Thanks for your detailed response! Now, the efficiency of my Yarn
>>>>>>>>> cluster has improved a lot after increasing the reducer number
>>>>>>>>> (mapreduce.job.reduces) in mapred-site.xml. But I still have some
>>>>>>>>> questions about the way Yarn executes an MRv1 job:
>>>>>>>>>
>>>>>>>>> 1. In Hadoop 1.x, a job is executed by map tasks and reduce tasks
>>>>>>>>> together, with a typical process (map > shuffle > reduce). In Yarn, as
>>>>>>>>> far as I know, an MRv1 job is executed only by the ApplicationMaster.
>>>>>>>>> - Yarn can run multiple kinds of jobs (MR, MPI, ...), but an MRv1 job
>>>>>>>>> has a special execution process (map > shuffle > reduce) in Hadoop
>>>>>>>>> 1.x, so how does Yarn execute an MRv1 job? Does it still include the
>>>>>>>>> special MR steps from Hadoop 1.x, like map, sort, merge, combine and
>>>>>>>>> shuffle?
>>>>>>>>> - Do the MRv1 parameters still work for Yarn? Like
>>>>>>>>> mapreduce.task.io.sort.mb and mapreduce.map.sort.spill.percent?
>>>>>>>>> - What is the general process for the ApplicationMaster in Yarn to
>>>>>>>>> execute a job?
>>>>>>>>>
>>>>>>>>> 2. In Hadoop 1.x, we can set the map/reduce slots by setting
>>>>>>>>> 'mapred.tasktracker.map.tasks.maximum' and
>>>>>>>>> 'mapred.tasktracker.reduce.tasks.maximum'.
>>>>>>>>> - For Yarn, the above two parameters do not work any more, as Yarn
>>>>>>>>> uses containers instead, right?
>>>>>>>>> - For Yarn, we can set the total physical memory for a NodeManager
>>>>>>>>> using 'yarn.nodemanager.resource.memory-mb'. But how do I set the
>>>>>>>>> default size of physical memory for a container?
>>>>>>>>> - How do I set the maximum size of physical memory for a container?
>>>>>>>>> By the parameter 'mapred.child.java.opts'?
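>>>>>>>>>
>>>>>>>>> Is it something like the following that I should be setting? (Just my
>>>>>>>>> guess from the property names I have seen; the values here are only
>>>>>>>>> examples.)
>>>>>>>>>
>>>>>>>>>   <!-- per-container request sizes (mapred-site.xml)? -->
>>>>>>>>>   <property>
>>>>>>>>>     <name>mapreduce.map.memory.mb</name>
>>>>>>>>>     <value>1024</value>
>>>>>>>>>   </property>
>>>>>>>>>   <property>
>>>>>>>>>     <name>mapreduce.reduce.memory.mb</name>
>>>>>>>>>     <value>2048</value>
>>>>>>>>>   </property>
>>>>>>>>>   <!-- cap on any single container (yarn-site.xml)? -->
>>>>>>>>>   <property>
>>>>>>>>>     <name>yarn.scheduler.maximum-allocation-mb</name>
>>>>>>>>>     <value>8192</value>
>>>>>>>>>   </property>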
>>>>>>>>>
>>>>>>>>> Thanks as always!
>>>>>>>>>
>>>>>>>>> 2013/6/9 Harsh J <harsh@cloudera.com>
>>>>>>>>>
>>>>>>>>>> Hi Sam,
>>>>>>>>>>
>>>>>>>>>> > - How do I know the container number? Why do you say it will be
>>>>>>>>>> > 22 containers due to a 22 GB memory resource?
>>>>>>>>>>
>>>>>>>>>> MR2's default configuration requests 1 GB of resource each for Map
>>>>>>>>>> and Reduce containers, and additionally 1.5 GB for the AM container
>>>>>>>>>> that runs the job. This is tunable using the properties Sandy
>>>>>>>>>> mentioned in his post.
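>>>>>>>>>>
>>>>>>>>>> In config terms, those defaults correspond to roughly the following
>>>>>>>>>> (a sketch; I'm writing the AM property name from memory, so do
>>>>>>>>>> double-check it):
>>>>>>>>>>
>>>>>>>>>>   <property>
>>>>>>>>>>     <name>mapreduce.map.memory.mb</name>
>>>>>>>>>>     <value>1024</value>
>>>>>>>>>>   </property>
>>>>>>>>>>   <property>
>>>>>>>>>>     <name>mapreduce.reduce.memory.mb</name>
>>>>>>>>>>     <value>1024</value>
>>>>>>>>>>   </property>
>>>>>>>>>>   <!-- AM container size; property name from memory -->
>>>>>>>>>>   <property>
>>>>>>>>>>     <name>yarn.app.mapreduce.am.resource.mb</name>
>>>>>>>>>>     <value>1536</value>
>>>>>>>>>>   </property>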
>>>>>>>>>>
>>>>>>>>>> > - My machine has 32 GB of memory; how much memory is appropriate
>>>>>>>>>> > to assign to containers?
>>>>>>>>>>
>>>>>>>>>> This is a general question. You may use the same process you used to
>>>>>>>>>> decide the optimal number of slots in MR1 to decide this here. Every
>>>>>>>>>> container is a new JVM, and you're limited by the CPUs you have there
>>>>>>>>>> (if not the memory). Either increase the memory requests from jobs,
>>>>>>>>>> to lower the number of concurrent containers at a given time (a
>>>>>>>>>> runtime change), or lower the NM's published memory resources to
>>>>>>>>>> control the same (a config change).
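>>>>>>>>>>
>>>>>>>>>> For example, either of the following would cut the number of
>>>>>>>>>> concurrent containers (the values are only illustrative):
>>>>>>>>>>
>>>>>>>>>>   <!-- job side: ask for bigger containers -->
>>>>>>>>>>   <property>
>>>>>>>>>>     <name>mapreduce.map.memory.mb</name>
>>>>>>>>>>     <value>2048</value>
>>>>>>>>>>   </property>
>>>>>>>>>>
>>>>>>>>>>   <!-- NM side: publish less memory to the RM -->
>>>>>>>>>>   <property>
>>>>>>>>>>     <name>yarn.nodemanager.resource.memory-mb</name>
>>>>>>>>>>     <value>16000</value>
>>>>>>>>>>   </property>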
>>>>>>>>>>
>>>>>>>>>> > - In mapred-site.xml, if I set 'mapreduce.framework.name' to
>>>>>>>>>> > 'yarn', will the other parameters in mapred-site.xml still work in
>>>>>>>>>> > the yarn framework? Like 'mapreduce.task.io.sort.mb' and
>>>>>>>>>> > 'mapreduce.map.sort.spill.percent'?
>>>>>>>>>>
>>>>>>>>>> Yes, all of these properties will still work. Old properties
>>>>>>>>>> specific
>>>>>>>>>> to JobTracker or TaskTracker (usually found as a keyword in the
>>>>>>>>>> config
>>>>>>>>>> name) will not apply anymore.
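>>>>>>>>>>
>>>>>>>>>> For instance (an illustrative pair): a TaskTracker-era knob such as
>>>>>>>>>> mapreduce.tasktracker.map.tasks.maximum is ignored under YARN, while
>>>>>>>>>> a per-task knob such as mapreduce.task.io.sort.mb still applies:
>>>>>>>>>>
>>>>>>>>>>   <!-- ignored by MR2/YARN: TaskTracker-specific -->
>>>>>>>>>>   <property>
>>>>>>>>>>     <name>mapreduce.tasktracker.map.tasks.maximum</name>
>>>>>>>>>>     <value>8</value>
>>>>>>>>>>   </property>
>>>>>>>>>>
>>>>>>>>>>   <!-- still honored: per-task sort buffer size (MB) -->
>>>>>>>>>>   <property>
>>>>>>>>>>     <name>mapreduce.task.io.sort.mb</name>
>>>>>>>>>>     <value>200</value>
>>>>>>>>>>   </property>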
>>>>>>>>>>
>>>>>>>>>> On Sun, Jun 9, 2013 at 2:21 PM, sam liu <samliuhadoop@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>> > Hi Harsh,
>>>>>>>>>> >
>>>>>>>>>> > According to the above suggestions, I removed the duplicated
>>>>>>>>>> > setting and reduced the values of 'yarn.nodemanager.resource.cpu-cores',
>>>>>>>>>> > 'yarn.nodemanager.vcores-pcores-ratio' and
>>>>>>>>>> > 'yarn.nodemanager.resource.memory-mb' to 16, 8 and 12000. And then
>>>>>>>>>> > the efficiency improved by about 18%.  I have some questions:
>>>>>>>>>> >
>>>>>>>>>> > - How do I know the container number? Why do you say it will be
>>>>>>>>>> > 22 containers due to a 22 GB memory resource?
>>>>>>>>>> > - My machine has 32 GB of memory; how much memory is appropriate
>>>>>>>>>> > to assign to containers?
>>>>>>>>>> > - In mapred-site.xml, if I set 'mapreduce.framework.name' to
>>>>>>>>>> > 'yarn', will the other parameters in mapred-site.xml still work in
>>>>>>>>>> > the yarn framework? Like 'mapreduce.task.io.sort.mb' and
>>>>>>>>>> > 'mapreduce.map.sort.spill.percent'?
>>>>>>>>>> >
>>>>>>>>>> > Thanks!
>>>>>>>>>> >
>>>>>>>>>> >
>>>>>>>>>> >
>>>>>>>>>> > 2013/6/8 Harsh J <harsh@cloudera.com>
>>>>>>>>>> >>
>>>>>>>>>> >> Hey Sam,
>>>>>>>>>> >>
>>>>>>>>>> >> Did you get a chance to retry with Sandy's suggestions? The config
>>>>>>>>>> >> appears to be asking NMs to run roughly 22 total containers (as
>>>>>>>>>> >> opposed to 12 total tasks in the MR1 config) due to a 22 GB memory
>>>>>>>>>> >> resource. This could have a big impact, given the CPU is still the
>>>>>>>>>> >> same for both test runs.
>>>>>>>>>> >>
>>>>>>>>>> >> On Fri, Jun 7, 2013 at 12:23 PM, Sandy Ryza <
>>>>>>>>>> sandy.ryza@cloudera.com>
>>>>>>>>>> >> wrote:
>>>>>>>>>> >> > Hey Sam,
>>>>>>>>>> >> >
>>>>>>>>>> >> > Thanks for sharing your results.  I'm definitely curious about
>>>>>>>>>> >> > what's causing the difference.
>>>>>>>>>> >> >
>>>>>>>>>> >> > A couple observations:
>>>>>>>>>> >> > It looks like you've got yarn.nodemanager.resource.memory-mb in
>>>>>>>>>> >> > there twice with two different values.
>>>>>>>>>> >> >
>>>>>>>>>> >> > Your max JVM memory of 1000 MB is (dangerously?) close to the
>>>>>>>>>> >> > default mapreduce.map/reduce.memory.mb of 1024 MB. Are any of
>>>>>>>>>> >> > your tasks getting killed for running over resource limits?
>>>>>>>>>> >> >
>>>>>>>>>> >> > -Sandy
>>>>>>>>>> >> >
>>>>>>>>>> >> >
>>>>>>>>>> >> > On Thu, Jun 6, 2013 at 10:21 PM, sam liu <
>>>>>>>>>> samliuhadoop@gmail.com> wrote:
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> The terasort execution log shows that the reduce phase spent
>>>>>>>>>> >> >> about 5.5 minutes going from 33% to 35%, as shown below.
>>>>>>>>>> >> >> 13/06/10 08:02:22 INFO mapreduce.Job:  map 100% reduce 31%
>>>>>>>>>> >> >> 13/06/10 08:02:25 INFO mapreduce.Job:  map 100% reduce 32%
>>>>>>>>>> >> >> 13/06/10 08:02:46 INFO mapreduce.Job:  map 100% reduce 33%
>>>>>>>>>> >> >> 13/06/10 08:08:16 INFO mapreduce.Job:  map 100% reduce 35%
>>>>>>>>>> >> >> 13/06/10 08:08:19 INFO mapreduce.Job:  map 100% reduce 40%
>>>>>>>>>> >> >> 13/06/10 08:08:22 INFO mapreduce.Job:  map 100% reduce 43%
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> Anyway, below are my configurations for your reference.
>>>>>>>>>> >> >> Thanks!
>>>>>>>>>> >> >> (A) core-site.xml
>>>>>>>>>> >> >> only define 'fs.default.name' and 'hadoop.tmp.dir'
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> (B) hdfs-site.xml
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>dfs.replication</name>
>>>>>>>>>> >> >>     <value>1</value>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>dfs.name.dir</name>
>>>>>>>>>> >> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/dfs_name_dir</value>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>dfs.data.dir</name>
>>>>>>>>>> >> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/dfs_data_dir</value>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>dfs.block.size</name>
>>>>>>>>>> >> >>     <value>134217728</value><!-- 128MB -->
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>dfs.namenode.handler.count</name>
>>>>>>>>>> >> >>     <value>64</value>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>dfs.datanode.handler.count</name>
>>>>>>>>>> >> >>     <value>10</value>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> (C) mapred-site.xml
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>mapreduce.cluster.temp.dir</name>
>>>>>>>>>> >> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/mapreduce_temp</value>
>>>>>>>>>> >> >>     <description>No description</description>
>>>>>>>>>> >> >>     <final>true</final>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>mapreduce.cluster.local.dir</name>
>>>>>>>>>> >> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/mapreduce_local_dir</value>
>>>>>>>>>> >> >>     <description>No description</description>
>>>>>>>>>> >> >>     <final>true</final>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> <property>
>>>>>>>>>> >> >>   <name>mapreduce.child.java.opts</name>
>>>>>>>>>> >> >>   <value>-Xmx1000m</value>
>>>>>>>>>> >> >> </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> <property>
>>>>>>>>>> >> >>     <name>mapreduce.framework.name</name>
>>>>>>>>>> >> >>     <value>yarn</value>
>>>>>>>>>> >> >>    </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>  <property>
>>>>>>>>>> >> >>     <name>mapreduce.tasktracker.map.tasks.maximum</name>
>>>>>>>>>> >> >>     <value>8</value>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>mapreduce.tasktracker.reduce.tasks.maximum</name>
>>>>>>>>>> >> >>     <value>4</value>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>mapreduce.tasktracker.outofband.heartbeat</name>
>>>>>>>>>> >> >>     <value>true</value>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> (D) yarn-site.xml
>>>>>>>>>> >> >>  <property>
>>>>>>>>>> >> >>     <name>yarn.resourcemanager.resource-tracker.address</name>
>>>>>>>>>> >> >>     <value>node1:18025</value>
>>>>>>>>>> >> >>     <description>host is the hostname of the resource
>>>>>>>>>> manager and
>>>>>>>>>> >> >>     port is the port on which the NodeManagers contact the
>>>>>>>>>> Resource
>>>>>>>>>> >> >> Manager.
>>>>>>>>>> >> >>     </description>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <description>The address of the RM web
>>>>>>>>>> application.</description>
>>>>>>>>>> >> >>     <name>yarn.resourcemanager.webapp.address</name>
>>>>>>>>>> >> >>     <value>node1:18088</value>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>yarn.resourcemanager.scheduler.address</name>
>>>>>>>>>> >> >>     <value>node1:18030</value>
>>>>>>>>>> >> >>     <description>host is the hostname of the
>>>>>>>>>> resourcemanager and port
>>>>>>>>>> >> >> is
>>>>>>>>>> >> >> the port
>>>>>>>>>> >> >>     on which the Applications in the cluster talk to the
>>>>>>>>>> Resource
>>>>>>>>>> >> >> Manager.
>>>>>>>>>> >> >>     </description>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>yarn.resourcemanager.address</name>
>>>>>>>>>> >> >>     <value>node1:18040</value>
>>>>>>>>>> >> >>     <description>the host is the hostname of the
>>>>>>>>>> ResourceManager and
>>>>>>>>>> >> >> the
>>>>>>>>>> >> >> port is the port on
>>>>>>>>>> >> >>     which the clients can talk to the Resource Manager.
>>>>>>>>>> </description>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>yarn.nodemanager.local-dirs</name>
>>>>>>>>>> >> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_local_dir</value>
>>>>>>>>>> >> >>     <description>the local directories used by the
>>>>>>>>>> >> >> nodemanager</description>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>yarn.nodemanager.address</name>
>>>>>>>>>> >> >>     <value>0.0.0.0:18050</value>
>>>>>>>>>> >> >>     <description>the nodemanagers bind to this
>>>>>>>>>> port</description>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>yarn.nodemanager.resource.memory-mb</name>
>>>>>>>>>> >> >>     <value>10240</value>
>>>>>>>>>> >> >>     <description>the amount of memory on the NodeManager in
>>>>>>>>>> >> >> GB</description>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>yarn.nodemanager.remote-app-log-dir</name>
>>>>>>>>>> >> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_app-logs</value>
>>>>>>>>>> >> >>     <description>directory on hdfs where the application
>>>>>>>>>> logs are moved
>>>>>>>>>> >> >> to
>>>>>>>>>> >> >> </description>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>    <property>
>>>>>>>>>> >> >>     <name>yarn.nodemanager.log-dirs</name>
>>>>>>>>>> >> >>     <value>/opt/hadoop-2.0.4-alpha/temp/hadoop/yarn_nm_log</value>
>>>>>>>>>> >> >>     <description>the directories used by Nodemanagers as log
>>>>>>>>>> >> >> directories</description>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>yarn.nodemanager.aux-services</name>
>>>>>>>>>> >> >>     <value>mapreduce.shuffle</value>
>>>>>>>>>> >> >>     <description>shuffle service that needs to be set for
>>>>>>>>>> Map Reduce to
>>>>>>>>>> >> >> run </description>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>   <property>
>>>>>>>>>> >> >>     <name>yarn.resourcemanager.client.thread-count</name>
>>>>>>>>>> >> >>     <value>64</value>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>  <property>
>>>>>>>>>> >> >>     <name>yarn.nodemanager.resource.cpu-cores</name>
>>>>>>>>>> >> >>     <value>24</value>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> <property>
>>>>>>>>>> >> >>     <name>yarn.nodemanager.vcores-pcores-ratio</name>
>>>>>>>>>> >> >>     <value>3</value>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>  <property>
>>>>>>>>>> >> >>     <name>yarn.nodemanager.resource.memory-mb</name>
>>>>>>>>>> >> >>     <value>22000</value>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>  <property>
>>>>>>>>>> >> >>     <name>yarn.nodemanager.vmem-pmem-ratio</name>
>>>>>>>>>> >> >>     <value>2.1</value>
>>>>>>>>>> >> >>   </property>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >> 2013/6/7 Harsh J <harsh@cloudera.com>
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>> Not tuning the configurations at all is wrong. YARN uses
>>>>>>>>>> >> >>> memory-resource-based scheduling, so MR2 requests a 1 GB
>>>>>>>>>> >> >>> minimum per container by default, which on the base configs
>>>>>>>>>> >> >>> maxes out at 8 total containers (due to the 8 GB NM memory
>>>>>>>>>> >> >>> resource config). Do share your configs, as at this point none
>>>>>>>>>> >> >>> of us can tell what the issue is.
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>> Obviously, it isn't our goal to make MR2 slower for users
>>>>>>>>>> and to not
>>>>>>>>>> >> >>> care about such things :)
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>> On Fri, Jun 7, 2013 at 8:45 AM, sam liu <
>>>>>>>>>> samliuhadoop@gmail.com>
>>>>>>>>>> >> >>> wrote:
>>>>>>>>>> >> >>> > At the beginning, I just wanted to do a fast comparison of
>>>>>>>>>> >> >>> > MRv1 and Yarn. But they have many differences, and to be fair
>>>>>>>>>> >> >>> > for the comparison I did not tune their configurations at all.
>>>>>>>>>> >> >>> > So I got the above test results. After analyzing the results,
>>>>>>>>>> >> >>> > no doubt, I will tune them and do the comparison again.
>>>>>>>>>> >> >>> >
>>>>>>>>>> >> >>> > Do you have any idea about the current test results? I think,
>>>>>>>>>> >> >>> > compared with MRv1, Yarn is better in the Map phase (teragen
>>>>>>>>>> >> >>> > test), but worse in the Reduce phase (terasort test). Do you
>>>>>>>>>> >> >>> > have any detailed suggestions/comments/materials on Yarn
>>>>>>>>>> >> >>> > performance tuning?
>>>>>>>>>> >> >>> >
>>>>>>>>>> >> >>> > Thanks!
>>>>>>>>>> >> >>> >
>>>>>>>>>> >> >>> >
>>>>>>>>>> >> >>> > 2013/6/7 Marcos Luis Ortiz Valmaseda <
>>>>>>>>>> marcosluis2186@gmail.com>
>>>>>>>>>> >> >>> >>
>>>>>>>>>> >> >>> >> Why not tune the configurations?
>>>>>>>>>> >> >>> >> Both frameworks have many areas to tune:
>>>>>>>>>> >> >>> >> - Combiners, shuffle optimization, block size, etc.
>>>>>>>>>> >> >>> >>
>>>>>>>>>> >> >>> >>
>>>>>>>>>> >> >>> >>
>>>>>>>>>> >> >>> >> 2013/6/6 sam liu <samliuhadoop@gmail.com>
>>>>>>>>>> >> >>> >>>
>>>>>>>>>> >> >>> >>> Hi Experts,
>>>>>>>>>> >> >>> >>>
>>>>>>>>>> >> >>> >>> We are thinking about whether to use Yarn or not in the
>>>>>>>>>> >> >>> >>> near future, and I ran teragen/terasort on Yarn and MRv1 for
>>>>>>>>>> >> >>> >>> comparison.
>>>>>>>>>> >> >>> >>>
>>>>>>>>>> >> >>> >>> My environment is a three-node cluster, and each node has
>>>>>>>>>> >> >>> >>> similar hardware: 2 CPUs (4 cores), 32 GB memory. Both the
>>>>>>>>>> >> >>> >>> Yarn and MRv1 clusters are set up on the same environment.
>>>>>>>>>> >> >>> >>> To be fair, I did not do any performance tuning on their
>>>>>>>>>> >> >>> >>> configurations, but used the default configuration values.
>>>>>>>>>> >> >>> >>>
>>>>>>>>>> >> >>> >>> Before testing, I thought Yarn would be much better than
>>>>>>>>>> >> >>> >>> MRv1 if they both used the default configuration, because
>>>>>>>>>> >> >>> >>> Yarn is a better framework than MRv1. However, the test
>>>>>>>>>> >> >>> >>> results show some differences:
>>>>>>>>>> >> >>> >>>
>>>>>>>>>> >> >>> >>> MRv1: Hadoop-1.1.1
>>>>>>>>>> >> >>> >>> Yarn: Hadoop-2.0.4
>>>>>>>>>> >> >>> >>>
>>>>>>>>>> >> >>> >>> (A) Teragen: generate 10 GB data:
>>>>>>>>>> >> >>> >>> - MRv1: 193 sec
>>>>>>>>>> >> >>> >>> - Yarn: 69 sec
>>>>>>>>>> >> >>> >>> Yarn is 2.8 times better than MRv1
>>>>>>>>>> >> >>> >>>
>>>>>>>>>> >> >>> >>> (B) Terasort: sort 10 GB data:
>>>>>>>>>> >> >>> >>> - MRv1: 451 sec
>>>>>>>>>> >> >>> >>> - Yarn: 1136 sec
>>>>>>>>>> >> >>> >>> Yarn is 2.5 times worse than MRv1
>>>>>>>>>> >> >>> >>>
>>>>>>>>>> >> >>> >>> After a quick analysis, I think the direct cause might be
>>>>>>>>>> >> >>> >>> that Yarn is much faster than MRv1 in the Map phase, but
>>>>>>>>>> >> >>> >>> much worse in the Reduce phase.
>>>>>>>>>> >> >>> >>>
>>>>>>>>>> >> >>> >>> Here I have two questions:
>>>>>>>>>> >> >>> >>> - Why do my tests show Yarn is worse than MRv1 for terasort?
>>>>>>>>>> >> >>> >>> - What is the strategy for tuning Yarn performance? Are
>>>>>>>>>> >> >>> >>> there any materials?
>>>>>>>>>> >> >>> >>>
>>>>>>>>>> >> >>> >>> Thanks!
>>>>>>>>>> >> >>> >>
>>>>>>>>>> >> >>> >>
>>>>>>>>>> >> >>> >>
>>>>>>>>>> >> >>> >>
>>>>>>>>>> >> >>> >> --
>>>>>>>>>> >> >>> >> Marcos Ortiz Valmaseda
>>>>>>>>>> >> >>> >> Product Manager at PDVSA
>>>>>>>>>> >> >>> >> http://about.me/marcosortiz
>>>>>>>>>> >> >>> >>
>>>>>>>>>> >> >>> >
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>>
>>>>>>>>>> >> >>> --
>>>>>>>>>> >> >>> Harsh J
>>>>>>>>>> >> >>
>>>>>>>>>> >> >>
>>>>>>>>>> >> >
>>>>>>>>>> >>
>>>>>>>>>> >>
>>>>>>>>>> >>
>>>>>>>>>> >> --
>>>>>>>>>> >> Harsh J
>>>>>>>>>> >
>>>>>>>>>> >
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Harsh J
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
