Subject: Re: YARN memory settings and the Apex memory model
From: Ananth Gundabattula
To: users@apex.incubator.apache.org
Date: Thu, 12 May 2016 21:00:12 +1000

Thanks Shubham. I shall bump up the memory a bit more.

I was wondering how the operator memory relates to the YARN container memory settings. Or does it depend on the deployment model?

For example, if the deployment model is thread local, does the YARN container need to be configured (taking the example above) with at least 2048 * number of operators + buffer server size of memory?
If the deployment model were not thread local, would the memory requirement per YARN container be lower?
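
The sizing guess in the question can be written out as plain arithmetic. This is only a sketch of the formula as asked, not a confirmed description of how Apex sizes containers; the function name and the 512 MB buffer server figure are hypothetical.

```python
def thread_local_container_mb(operator_mb, operator_count, buffer_server_mb):
    """Container size under the question's guess: all operators share one container."""
    return operator_mb * operator_count + buffer_server_mb


# Three operators at 2048 MB each, plus an assumed 512 MB for the buffer server.
print(thread_local_container_mb(2048, 3, 512))  # 6656
```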
Regards,
Ananth

On Thu, May 12, 2016 at 7:19 PM, Shubham Pathak <shubham@datatorrent.com> wrote:
Hello Ananth,

It looks like the operator requires more memory.
You may add this property to have more memory allocated to the container.

In properties.xml, for operator O in the application, you may specify the property:

<property>
  <name>dt.operator.O.attr.MEMORY_MB</name>
  <value>2048</value>
</property>
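
To check which operators already have a memory override, the property names can be read back out of the file. The sketch below assumes the properties are wrapped in a Hadoop-style <configuration> root element, which the message above does not show; the sample content and the function name are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical properties.xml content; only the <property> entry comes from the thread.
SAMPLE = """<configuration>
  <property>
    <name>dt.operator.O.attr.MEMORY_MB</name>
    <value>2048</value>
  </property>
</configuration>"""

# Captures the operator name from dt.operator.<name>.attr.MEMORY_MB.
PATTERN = re.compile(r"^dt\.operator\.(.+)\.attr\.MEMORY_MB$")


def operator_memory_mb(xml_text):
    """Return {operator name: MEMORY_MB} for every matching property."""
    result = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        match = PATTERN.match(prop.findtext("name", default="").strip())
        if match:
            result[match.group(1)] = int(prop.findtext("value").strip())
    return result


print(operator_memory_mb(SAMPLE))  # {'O': 2048}
```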

Thanks,
Shubham

On Thu, May 12, 2016 at 1:35 PM, Ananth Gundabattula <agundabattula@gmail.com> wrote:
Hello All,

I am occasionally seeing the following log in the web UI when my operators are getting killed. Is there any way I can control the memory settings that are used to communicate with YARN when negotiating a container?

How do the typical YARN settings for a container's heap and max memory relate to the Apex memory allocation model?

The info messages I see in the web UI are as follows:

Container [pid=14699,containerID=container_1462863487071_0015_01_000012] is running beyond physical memory limits. Current usage: 1.5 GB of 1.5 GB physical memory used; 6.1 GB of 3.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1462863487071_0015_01_000012 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 14817 14699 14699 14699 (java) 1584 1654 6426968064 393896 /usr/java/default/bin/java -Xmx4429185024 -Ddt.attr.APPLICATION_PATH=hdfs://dwh109.qaperf2.sac.int.threatmetrix.com:8020/user/dtadmin/datatorrent/apps/application_1462863487071_0015 -Djava.io.tmpdir=/data3/yarn/nm/usercache/root/appcache/application_1462863487071_0015/container_1462863487071_0015_01_000012/tmp -Ddt.cid=container_1462863487071_0015_01_000012 -Dhadoop.root.logger=INFO,RFA -Dhadoop.log.dir=/data3/yarn/container-logs/application_1462863487071_0015/container_1462863487071_0015_01_000012 -Ddt.loggers.level=com.datatorrent.*:INFO,org.apache.*:INFO com.datatorrent.stram.engine.StreamingContainer
	|- 14699 14697 14699 14699 (bash) 1 2 108646400 303 /bin/bash -c /usr/java/default/bin/java -Xmx4429185024 -Ddt.attr.APPLICATION_PATH=hdfs://dwh109.qaperf2.sac.int.threatmetrix.com:8020/user/dtadmin/datatorrent/apps/application_1462863487071_0015 -Djava.io.tmpdir=/data3/yarn/nm/usercache/root/appcache/application_1462863487071_0015/container_1462863487071_0015_01_000012/tmp -Ddt.cid=container_1462863487071_0015_01_000012 -Dhadoop.root.logger=INFO,RFA -Dhadoop.log.dir=/data3/yarn/container-logs/application_1462863487071_0015/container_1462863487071_0015_01_000012 -Ddt.loggers.level=com.datatorrent.*:INFO,org.apache.*:INFO com.datatorrent.stram.engine.StreamingContainer 1>/data3/yarn/container-logs/application_1462863487071_0015/container_1462863487071_0015_01_000012/stdout 2>/data3/yarn/container-logs/application_1462863487071_0015/container_1462863487071_0015_01_000012/stderr

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
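
The figures in this dump can be cross-checked with a little arithmetic. The sketch below assumes a 4096-byte memory page, which the log does not state. Under that assumption the resident memory of the two processes comes to roughly 1.5 GiB, in line with the reported physical limit, while the -Xmx value works out to 4.125 GiB, well above that limit.

```python
PAGE_SIZE = 4096          # assumed page size in bytes; the log does not state it
GIB = 1024 ** 3

java_rss_pages = 393896   # RSSMEM_USAGE(PAGES) for the java process
bash_rss_pages = 303      # RSSMEM_USAGE(PAGES) for the bash wrapper
java_vmem_bytes = 6426968064
bash_vmem_bytes = 108646400
xmx_bytes = 4429185024    # from -Xmx on the java command line

# Convert everything to GiB so it can be compared with the YARN message.
rss_gib = (java_rss_pages + bash_rss_pages) * PAGE_SIZE / GIB
vmem_gib = (java_vmem_bytes + bash_vmem_bytes) / GIB
xmx_gib = xmx_bytes / GIB

print(f"resident: {rss_gib:.2f} GiB, virtual: {vmem_gib:.2f} GiB, "
      f"max heap: {xmx_gib:.3f} GiB")
```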


Regards,
Ananth


