incubator-s4-user mailing list archives

From Daniel Gómez Ferro <danie...@yahoo-inc.com>
Subject Re: Deploy S4 App with Yarn -- Memory Problem
Date Fri, 16 Nov 2012 12:19:14 GMT
Hi Frank



On 16/11/12 03:05 , Frank Zheng wrote:
> Hi,
>
> I set these parameters in the yarn-site.xml
> And when I change the minimum limit of memory from 128 to 512, the above
> error is solved.

I think the problem is that we are requesting a container of <minMem> MB 
(128 or 512 in your example) and then using the same value to specify 
the maximum size of the heap in S4YarnClient with the -Xmx parameter.

As Matthieu said, with 128 the heap grows to its maximum size, and the
extra MB for thread stacks push the process above the requested 128 MB
container (138.2 MB in your log), hence the termination of the container.
My guess is that with 512 the heap doesn't need the full 512 MB, so the
process stays below the container limit.

We should rethink how we choose the container size from the minimum and
maximum values, and then set the heap size to the requested container
size _minus_ some MB for non-heap JVM memory (thread stacks and the like).
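
A minimal sketch of that idea, not the actual S4YarnClient code. The class
and method names are hypothetical, and the 64 MB headroom is only an assumed
placeholder that would need to be measured. The rounding follows the comment
quoted from S4ApplicationMaster (multiple of the minimum, capped at the
maximum).

```java
// Hypothetical helper, for illustration only; not part of S4 or YARN.
final class ContainerSizing {

    // Assumed headroom for non-heap JVM memory (thread stacks, class
    // metadata, native buffers). The right value has to be measured.
    static final int JVM_OVERHEAD_MB = 64;

    // Round the requested size up to a multiple of the scheduler minimum
    // and cap it at the scheduler maximum.
    static int containerSizeMb(int requestedMb, int minMb, int maxMb) {
        if (minMb <= 0 || maxMb < minMb) {
            throw new IllegalArgumentException("invalid scheduler limits");
        }
        int atLeastMin = Math.max(requestedMb, minMb);
        int rounded = ((atLeastMin + minMb - 1) / minMb) * minMb;
        return Math.min(rounded, maxMb);
    }

    // Heap size is the container size minus the non-heap headroom.
    static int heapSizeMb(int containerMb, int overheadMb) {
        int heap = containerMb - overheadMb;
        if (heap <= 0) {
            throw new IllegalArgumentException(
                "container of " + containerMb + " MB is too small for "
                + overheadMb + " MB of JVM overhead");
        }
        return heap;
    }

    // Build the JVM flag passed on the container command line.
    static String xmxFlag(int heapMb) {
        return "-Xmx" + heapMb + "m";
    }

    public static void main(String[] args) {
        // Limits taken from the yarn-site.xml quoted below.
        int container = containerSizeMb(512, 128, 3072);
        int heap = heapSizeMb(container, JVM_OVERHEAD_MB);
        System.out.println(container + " MB container, " + xmxFlag(heap));
    }
}
```

With this approach a 128 MB container would get a heap well below 128 MB
instead of -Xmx128m, so the process as a whole has a chance of staying
under the limit that the NodeManager enforces.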

Regards,

Daniel

> But still each container just has 512MB memory.
> So why?
>
> <property>
>   <name>yarn.scheduler.minimum-allocation-mb</name>
>   <value>128</value>
>   <final>true</final>
>    <description>
>     Minimum limit of memory (in MBs) to allocate to each container
>     request at the Resource Manager.
>    </description>
> </property>
>
> <property>
>   <name>yarn.scheduler.maximum-allocation-mb</name>
>   <value>3072</value>
>   <final>true</final>
>    <description>
>     Maximum limit of memory (in MBs) to allocate to each container
>     request at the Resource Manager.
>    </description>
> </property>
>
> <property>
>   <name>yarn.resourcemanager.nodes.include-path</name>
>   <value>/usr/hadoop/etc/hadoop/slaves</value>
>   <final>true</final>
>    <description>
>     List of permitted/excluded Node Manager nodes
>     (the list file is stored in the local filesystem)
>    </description>
> </property>
>
> <!-- NM only configuration -->
> <property>
>   <name>yarn.nodemanager.resource.memory-mb</name>
>   <value>3072</value>
>    <description>
>     Resource i.e. available physical memory, in MB, for given NodeManager.
>     Defines total available resources on the NodeManager to be made
>     available to running containers.
>    </description>
> </property>
>
> <property>
>   <name>yarn.nodemanager.vmem-pmem-ratio</name>
>   <value>42</value>
>    <description>
>     Maximum ratio by which virtual memory usage of tasks may exceed
>     physical memory
>    </description>
> </property>
>
> On Fri, Nov 16, 2012 at 2:32 AM, Matthieu Morel
> <matthieu.morel@gmail.com> wrote:
>
>     Hi,
>
>     What specific parameter are you trying to set in the yarn
>     configuration? It looks like you have a max of 128MB per container.
>
>     What seems to be failing is the application master. It uses (by
>     default) 128MB for heap + default stack size, but that would be over
>     128MB.
>
>     Of course, we still have to properly handle all  parameters available in
>     the yarn command, including memory settings, but from what I see the
>     issue you are facing is related to the configuration of your Yarn
>     cluster. You can get more information from other log files (apart from
>     the client console logs), maybe that can help as well. But somewhere
>     you seem to be capping the container memory to an insufficient value.
>
>     Regards,
>
>     Matthieu
>
>     On Thu, Nov 15, 2012 at 9:12 AM, Frank Zheng
>     <bearzheng2011@gmail.com> wrote:
>      > Hi All,
>      >
>      > I am confused about the memory configuration.
>      > In the S4ApplicationMaster.java, it says:
>      >         // A resource ask has to be at least the minimum of the capability
>      >         // of the cluster, the value has to be
>      >         // a multiple of the min value and cannot exceed the max.
>      >         // If it is not an exact multiple of min, the RM will allocate to
>      >         // the nearest multiple of min
>      > So I set the minimum memory to 128 in yarn-site.xml.
>      > But when I deployed the Twitter Counter application, I got this error.
>      >
>      > 17:04:25.558 [main] INFO  o.apache.s4.tools.yarn.S4YarnClient - Got
>      > application report from ASM for, appId=1, clientToken=null, appDiagnostics=,
>      > appMasterHost=, appQueue=default, appMasterRpcPort=0,
>      > appStartTime=1352970249485, yarnAppState=RUNNING,
>      > distributedFinalState=UNDEFINED, appTrackingUrl=, appUser=root
>      > 17:04:26.560 [main] INFO  o.apache.s4.tools.yarn.S4YarnClient - Got
>      > application report from ASM for, appId=1, clientToken=null, appDiagnostics=,
>      > appMasterHost=, appQueue=default, appMasterRpcPort=0,
>      > appStartTime=1352970249485, yarnAppState=RUNNING,
>      > distributedFinalState=UNDEFINED, appTrackingUrl=, appUser=root
>      > 17:04:27.563 [main] INFO  o.apache.s4.tools.yarn.S4YarnClient - Got
>      > application report from ASM for, appId=1, clientToken=null, appDiagnostics=,
>      > appMasterHost=, appQueue=default, appMasterRpcPort=0,
>      > appStartTime=1352970249485, yarnAppState=RUNNING,
>      > distributedFinalState=UNDEFINED, appTrackingUrl=, appUser=root
>      > 17:04:28.569 [main] INFO  o.apache.s4.tools.yarn.S4YarnClient - Got
>      > application report from ASM for, appId=1, clientToken=null,
>      > appDiagnostics=Application application_1352970173321_0001 failed 1 times due
>      > to AM Container for appattempt_1352970173321_0001_000001 exited with
>      > exitCode: 143 due to: Container
>      > [pid=12119,containerID=container_1352970173321_0001_01_000001] is running
>      > beyond physical memory limits. Current usage: 138.2mb of 128.0mb physical
>      > memory used; 1.1gb of 5.2gb virtual memory used. Killing container.
>      > Dump of the process-tree for container_1352970173321_0001_01_000001 :
>      >     |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
>      > SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>      >     |- 12213 12119 12119 12119 (java) 285 11 1085763584 35085 java -Xmx128m
>      > org.apache.s4.tools.yarn.S4ApplicationMaster --container_memory 10
>      > --num_containers 3 --priority 0 -c counter -zk testing.machine1:2181
>      >     |- 12119 9509 12119 12119 (bash) 4 3 108703744 304 /bin/bash -c java
>      > -Xmx128m org.apache.s4.tools.yarn.S4ApplicationMaster --container_memory 10
>      > --num_containers 3 --priority 0 -c counter -zk testing.machine1:2181
>      > 1>/home/hadoop/data/log/application_1352970173321_0001/container_1352970173321_0001_01_000001/AppMaster.stdout
>      > 2>/home/hadoop/data/log/application_1352970173321_0001/container_1352970173321_0001_01_000001/AppMaster.stderr
>      >
>      >
>      > .Failing this attempt.. Failing the application., appMasterHost=,
>      > appQueue=default, appMasterRpcPort=0, appStartTime=1352970249485,
>      > yarnAppState=FAILED, distributedFinalState=FAILED, appTrackingUrl=,
>      > appUser=root
>      > 17:04:28.569 [main] INFO  o.apache.s4.tools.yarn.S4YarnClient - Application
>      > did not finish. YarnState=FAILED, DSFinalStatus=FAILED. Breaking monitoring
>      > loop
>      > 17:04:28.569 [main] ERROR o.apache.s4.tools.yarn.S4YarnClient - Application
>      > failed to complete successfully
>      >
>      >
>      > Should the S4AppMaster use a multiple of the minimum memory automatically?
>      > Why is the memory of the container only 128 MB?
>      >
>      >
>      > Sincerely,
>      > Zheng Yu
>      > Mobile:  (852) 60670059
>      > Email: bearzheng2011@gmail.com
>      >
>      >
>      >
>
>
>
>
> --
> Sincerely,
> Zheng Yu
> Mobile:  (852) 60670059
> Email: bearzheng2011@gmail.com
>
>
>
