hadoop-yarn-issues mailing list archives

From "Brahma Reddy Battula (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5368) memory leak at timeline server
Date Tue, 01 Nov 2016 14:28:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15625562#comment-15625562 ]

Brahma Reddy Battula commented on YARN-5368:
--------------------------------------------

[~wyukawa] I believe this issue occurs only with LevelDB on {{CentOS 6.7}}. Did you find any
workaround for it?

Recently I noticed the same issue with the {{NodeManager}} when recovery is enabled. The NM's RES
(resident memory) keeps growing, which makes {{ResourceLocalization}} slow.

*When NM RES memory is high, ResourceLocalization took ~3 mins*
{noformat}
2016-10-21 13:48:14,481 | INFO  | LocalizerRunner for container_e08_1476679121221_34954_01_000005
| Writing credentials to the nmPrivate file /srv/BigData/data12/yarn/localdir/nmPrivate/container_e08_1476679121221_34954_01_000005.tokens.
Credentials list:  | ResourceLocalizationService.java:1238
 2016-10-21 13:48:14,487  | INFO  | LocalizerRunner for container_e08_1476679121221_34954_01_000006
| Writing credentials to the nmPrivate file /srv/BigData/data5/yarn/localdir/nmPrivate/container_e08_1476679121221_34954_01_000006.tokens.
Credentials list:  | ResourceLocalizationService.java:1238
 2016-10-21 13:51:40,382  | INFO  | IPC Server handler 3 on 26007 | Resource hdfs://hacluster/tmp/hadoop-yarn/staging/IOCLG/.staging/job_1476679121221_34954/libjars/hbase-server-1.0.2.jar(->/srv/BigData/data22/yarn/localdir/usercache/IOCLG/filecache/557841/hbase-server-1.0.2.jar)
transitioned from DOWNLOADING to LOCALIZED | LocalizedResource.java:203
{noformat}

*ResourceLocalization time when NM memory usage is normal*
{noformat}
2016-10-21 14:19:05,600 | INFO  | LocalizerRunner for container_e10_1477030404479_0013_01_000006
| Writing credentials to the nmPrivate file /srv/BigData/data6/yarn/localdir/nmPrivate/container_e10_1477030404479_0013_01_000006.tokens.
Credentials list:  | ResourceLocalizationService.java:1238
 2016-10-21 14:19:05,600  | INFO  | LocalizerRunner for container_e10_1477030404479_0013_01_000005
| Writing credentials to the nmPrivate file /srv/BigData/data15/yarn/localdir/nmPrivate/container_e10_1477030404479_0013_01_000005.tokens.
Credentials list:  | ResourceLocalizationService.java:1238
 2016-10-21 14:19:07,860  | INFO  | IPC Server handler 2 on 26007 | Resource hdfs://hacluster/tmp/hadoop-yarn/staging/IOCLG/.staging/job_1477030404479_0013/libjars/hbase-server-1.0.2.jar(->/srv/BigData/data15/yarn/localdir/usercache/IOCLG/filecache/558308/hbase-server-1.0.2.jar)
transitioned from DOWNLOADING to LOCALIZED | LocalizedResource.java:203
2016-10-21 14:19:07,898 | INFO  | IPC Server handler 3 on 26007 | Resource hdfs://hacluster/tmp/hadoop-yarn/staging/IOCLG/.staging/job_1477030404479_0013/libjars/hbase-client-1.0.2.jar(->/srv/BigData/data19/yarn/localdir/usercache/IOCLG/filecache/558312/hbase-client-1.0.2.jar)
transitioned from DOWNLOADING to LOCALIZED | LocalizedResource.java:203
{noformat}
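
To make the comparison between the two excerpts concrete, the elapsed time can be computed from the log timestamps. This is only a rough sketch: the helper name {{elapsed_seconds}} is hypothetical, and it assumes that the gap between the "Writing credentials" line and the first "transitioned from DOWNLOADING to LOCALIZED" line is a fair proxy for localization time.

```python
from datetime import datetime

# Timestamp layout used in the NodeManager log excerpts above
# (comma before the milliseconds).
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the seconds between two log timestamps (hypothetical helper)."""
    t0 = datetime.strptime(start.strip(), LOG_TIME_FORMAT)
    t1 = datetime.strptime(end.strip(), LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# Slow case: last "Writing credentials" line -> first LOCALIZED transition.
slow = elapsed_seconds("2016-10-21 13:48:14,487", "2016-10-21 13:51:40,382")

# Normal case: last "Writing credentials" line -> first LOCALIZED transition.
normal = elapsed_seconds("2016-10-21 14:19:05,600", "2016-10-21 14:19:07,860")

print(f"slow: {slow:.3f} s, normal: {normal:.3f} s")
```

Under that assumption the slow case comes out at roughly 206 seconds (about 3.4 minutes) against roughly 2.3 seconds in the normal case.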

I looked through the LevelDB community and did not find any memory leak fix after the {{1.8}}
release.

[~jlowe], any thoughts on this? Thanks.

> memory leak at timeline server
> ------------------------------
>
>                 Key: YARN-5368
>                 URL: https://issues.apache.org/jira/browse/YARN-5368
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: timelineserver
>    Affects Versions: 2.7.1
>         Environment: HDP2.4
> CentOS 6.7
> jdk1.8.0_72
>            Reporter: Wataru Yukawa
>
> Memory usage of the timeline server machine increases gradually.
> https://gyazo.com/952dad96c77ae053bae2e4d8c8ab0572
> Please check the period since April.
> According to my investigation, the timeline server uses about 25GB.
> top command result
> {code}
> 90577 yarn      20   0 28.4g  25g  12m S  0.0 40.1   5162:53 /usr/java/jdk1.8.0_72/bin/java
-Dproc_timelineserver -Xmx1024m -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn
-Dyarn.log.dir=/var/log/hadoop-yarn/yarn ...
> {code}
> ps command result
> {code}
> $ ps ww 90577
>  90577 ?        Sl   5162:53 /usr/java/jdk1.8.0_72/bin/java -Dproc_timelineserver -Xmx1024m
-Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn -Dyarn.log.dir=/var/log/hadoop-yarn/yarn
-Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log -Dyarn.log.file=yarn-yarn-timelineserver-myhost.log
-Dyarn.home.dir= -Dyarn.id.str=yarn -Dhadoop.root.logger=INFO,EWMA,RFA -Dyarn.root.logger=INFO,EWMA,RFA
-Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
-Dyarn.policy.file=hadoop-policy.xml -Djava.io.tmpdir=/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
-Dhadoop.log.dir=/var/log/hadoop-yarn/yarn -Dyarn.log.dir=/var/log/hadoop-yarn/yarn -Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log
-Dyarn.log.file=yarn-yarn-timelineserver-myhost.log -Dyarn.home.dir=/usr/hdp/current/hadoop-yarn-timelineserver
-Dhadoop.home.dir=/usr/hdp/2.4.0.0-169/hadoop -Dhadoop.root.logger=INFO,EWMA,RFA -Dyarn.root.logger=INFO,EWMA,RFA
-Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
-classpath /usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/lib/*:/usr/hdp/2.4.0.0-169/hadoop/.//*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/./:/usr/hdp/2.4.0.0-169/hadoop-hdfs/lib/*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/.//*:/usr/hdp/2.4.0.0-169/hadoop-yarn/lib/*:/usr/hdp/2.4.0.0-169/hadoop-yarn/.//*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/lib/*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/.//*::/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/current/hadoop-yarn-timelineserver/.//*:/usr/hdp/current/hadoop-yarn-timelineserver/lib/*:/usr/hdp/2.4.0.0-169/hadoop/conf/timelineserver-config/log4j.properties
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
> {code}
>  
> Although I set -Xmx1024m, the actual memory usage is 25GB.
> After I restart the timeline server, memory usage of the timeline server machine decreases.
> https://gyazo.com/130600c17a7d41df8606727a859ae7e3
> Now timelineserver uses less than 1GB memory.
> top command result
> {code}
>  6163 yarn      20   0 3959m 783m  46m S  0.3  1.2   3:37.60 /usr/java/jdk1.8.0_72/bin/java
-Dproc_timelineserver -Xmx1024m -Dhdp.version=2.4.0.0-169 ...
> {code}
> I suspect a memory leak in the timeline server.
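
As a rough illustration of the gap described in the report, the RES column from {{top}} can be compared with the configured {{-Xmx}} value. {{-Xmx}} limits only the Java heap, so a RES far above it points at memory held outside the heap, for example native allocations. The helper names below are hypothetical, the inputs are the rounded figures from the {{top}} output above, and the sketch assumes that {{top}} suffixes are 1024-based and that an unsuffixed value is in KiB. It does not identify where the memory goes.

```python
# Multipliers that convert a top(1) size suffix to MiB.
# Assumption: suffixes are 1024-based and a bare number is in KiB.
_SUFFIX_TO_MIB = {"k": 1 / 1024, "m": 1.0, "g": 1024.0, "t": 1024.0 * 1024.0}


def top_size_to_mib(field: str) -> float:
    """Convert a top size field such as '25g' or '783m' to MiB (hypothetical helper)."""
    text = field.strip().lower()
    if text and text[-1] in _SUFFIX_TO_MIB:
        return float(text[:-1]) * _SUFFIX_TO_MIB[text[-1]]
    return float(text) / 1024.0  # plain number: treat as KiB


def res_to_heap_ratio(res_field: str, xmx_mib: int) -> float:
    """Return resident memory divided by the configured maximum heap size."""
    return top_size_to_mib(res_field) / xmx_mib


# RES values reported before and after the restart, against -Xmx1024m.
before_restart = res_to_heap_ratio("25g", 1024)
after_restart = res_to_heap_ratio("783m", 1024)

print(f"before restart: {before_restart:.2f}x heap limit")
print(f"after restart:  {after_restart:.2f}x heap limit")
```

With these rounded inputs, RES is about 25 times the heap limit before the restart and below the limit afterwards.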



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

