hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5368) memory leak at timeline server
Date Tue, 01 Nov 2016 20:09:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15626529#comment-15626529
] 

Jason Lowe commented on YARN-5368:
----------------------------------

bq. Recently I noticed same issue with NodeManger when recovery is enabled.NM RES is keep
on growing which leads ResourceLocalization slow.

We have not seen that on our clusters.  Three minutes is a _really_ long time.  Do you have
gc logging enabled for the nodemanager JVM?  It would be interesting to know if it was trying
to run one or more GC cycles during that time.  If it wasn't GC cycles then I'm not sure how
increased off-heap memory would directly contribute to slower resource localization unless
the machine was near or at the point where it started swapping.

As for the timeline server memory usage, it looks like the rolling level db instances are
starting to pile up, accumulating a lot of off-heap memory.  Pinging [~jeagles] since I vaguely
remember something like this occurring in the past, and there may be a known fix for that
issue.

> memory leak at timeline server
> ------------------------------
>
>                 Key: YARN-5368
>                 URL: https://issues.apache.org/jira/browse/YARN-5368
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: timelineserver
>    Affects Versions: 2.7.1
>         Environment: HDP2.4
> CentOS 6.7
> jdk1.8.0_72
>            Reporter: Wataru Yukawa
>
> memory usage of timeline server machine increases gradually.
> https://gyazo.com/952dad96c77ae053bae2e4d8c8ab0572
> please check since April.
> According to my investigation, timeline server used about 25GB.
> top command result
> {code}
> 90577 yarn      20   0 28.4g  25g  12m S  0.0 40.1   5162:53 /usr/java/jdk1.8.0_72/bin/java
-Dproc_timelineserver -Xmx1024m -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn
-Dyarn.log.dir=/var/log/hadoop-yarn/yarn ...
> {code}
> ps command result
> {code}
> $ ps ww 90577
>  90577 ?        Sl   5162:53 /usr/java/jdk1.8.0_72/bin/java -Dproc_timelineserver -Xmx1024m
-Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn -Dyarn.log.dir=/var/log/hadoop-yarn/yarn
-Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log -Dyarn.log.file=yarn-yarn-timelineserver-myhost.log
-Dyarn.home.dir= -Dyarn.id.str=yarn -Dhadoop.root.logger=INFO,EWMA,RFA -Dyarn.root.logger=INFO,EWMA,RFA
-Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
-Dyarn.policy.file=hadoop-policy.xml -Djava.io.tmpdir=/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
-Dhadoop.log.dir=/var/log/hadoop-yarn/yarn -Dyarn.log.dir=/var/log/hadoop-yarn/yarn -Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log
-Dyarn.log.file=yarn-yarn-timelineserver-myhost.log -Dyarn.home.dir=/usr/hdp/current/hadoop-yarn-timelineserver
-Dhadoop.home.dir=/usr/hdp/2.4.0.0-169/hadoop -Dhadoop.root.logger=INFO,EWMA,RFA -Dyarn.root.logger=INFO,EWMA,RFA
-Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
-classpath /usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/lib/*:/usr/hdp/2.4.0.0-169/hadoop/.//*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/./:/usr/hdp/2.4.0.0-169/hadoop-hdfs/lib/*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/.//*:/usr/hdp/2.4.0.0-169/hadoop-yarn/lib/*:/usr/hdp/2.4.0.0-169/hadoop-yarn/.//*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/lib/*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/.//*::/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/current/hadoop-yarn-timelineserver/.//*:/usr/hdp/current/hadoop-yarn-timelineserver/lib/*:/usr/hdp/2.4.0.0-169/hadoop/conf/timelineserver-config/log4j.properties
org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
> {code}
>  
> Alghough I set -Xmx1024m, actual memory usage is 25GB.
> After I restart timeline server, memory usage of timeline server machine decreases.
> https://gyazo.com/130600c17a7d41df8606727a859ae7e3
> Now timelineserver uses less than 1GB memory.
> top command result
> {code}
>  6163 yarn      20   0 3959m 783m  46m S  0.3  1.2   3:37.60 /usr/java/jdk1.8.0_72/bin/java
-Dproc_timelineserver -Xmx1024m -Dhdp.version=2.4.0.0-169 ...
> {code}
> I suspect memory leak at timeline server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org


Mime
View raw message