hadoop-yarn-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_
Date Tue, 24 Oct 2017 03:52:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16216259#comment-16216259 ]

Hudson commented on YARN-7320:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13127 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13127/])
YARN-7320. Duplicate LiteralByteStrings in (rkanter: rev 5da295a34e39b507e8291073782e0576cd06896a)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatResponsePBImpl.java


> Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_
> --------------------------------------------------------------------------------
>
>                 Key: YARN-7320
>                 URL: https://issues.apache.org/jira/browse/YARN-7320
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Misha Dmitriev
>            Assignee: Misha Dmitriev
>             Fix For: 3.0.0
>
>         Attachments: YARN-7320.01.patch, YARN-7320.02.patch
>
>
> Using jxray (www.jxray.com), I've analyzed several heap dumps from a YARN Resource Manager
running in a large cluster. The tool uncovered several sources of memory waste. One problem,
which wastes more than a quarter of all memory, is a large number of duplicate
{{LiteralByteString}} objects coming from the following reference chain:
> {code}
> 1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique)
> ↖com.google.protobuf.LiteralByteString.bytes
> ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_
> ↖{j.u.ArrayList}
> ↖j.u.Collections$UnmodifiableRandomAccessList.c
> ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_
> ↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse
> ↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode
> ...
> {code}
> That is, collectively, reference chains like the one above hold 5.4 million
{{LiteralByteString}} objects in memory, but only ~22 thousand of these objects are unique.
Deduplicating these objects, e.g. with an object interner such as Google Guava's {{Interner}},
would save ~1GB of memory.
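
As a rough sanity check of the ~1GB figure, the numbers from the jxray output can be plugged into a small calculation. This is only a sketch: it assumes (the report does not say this) that duplicate and unique arrays have the same average size, and the class and method names are hypothetical.

```java
public class DedupSavingsEstimate {

    /**
     * Estimates how many KB would be freed if every duplicate array were
     * replaced by a reference to a single shared copy.
     * Assumption: all arrays share the same average size.
     */
    static long estimateSavedKb(long totalKb, long totalArrays, long uniqueArrays) {
        double avgKbPerArray = (double) totalKb / totalArrays;
        long duplicateArrays = totalArrays - uniqueArrays;
        return Math.round(avgKbPerArray * duplicateArrays);
    }

    public static void main(String[] args) {
        // Figures taken from the heap dump line quoted above:
        // 1,011,810K total, 5,416,705 arrays, 22,108 unique.
        long savedKb = estimateSavedKb(1_011_810L, 5_416_705L, 22_108L);
        System.out.println("Estimated savings: " + savedKb + "K");
    }
}
```

Under that assumption the estimate comes out slightly above 1,000,000K, which is consistent with the ~1GB stated in the report.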
> It looks like the main place where the above {{LiteralByteString}}s are created and attached
to the {{SystemCredentialsForAppsProto}} objects is the {{addSystemCredentialsToProto()}}
method in {{NodeHeartbeatResponsePBImpl.java}}. Adding a call to an interner there will
probably fix the problem.
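
The interning idea can be illustrated with a minimal, dependency-free sketch. It is not the code from the attached patches: it uses plain {{byte[]}} values as stand-ins for {{LiteralByteString}}, and the class {{ByteArrayInterner}} and its methods are hypothetical names chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ByteArrayInterner {

    // ByteBuffer.equals()/hashCode() compare the buffer's remaining content,
    // so a wrapped array can serve as a content-based map key.
    private final ConcurrentMap<ByteBuffer, byte[]> pool = new ConcurrentHashMap<>();

    /**
     * Returns the canonical array with the same content as the candidate.
     * The first array seen for a given content becomes the canonical one.
     * Callers must treat interned arrays as immutable.
     */
    public byte[] intern(byte[] candidate) {
        byte[] existing = pool.putIfAbsent(ByteBuffer.wrap(candidate), candidate);
        return existing != null ? existing : candidate;
    }

    /** Number of distinct contents currently held in the pool. */
    public int uniqueCount() {
        return pool.size();
    }

    public static void main(String[] args) {
        ByteArrayInterner interner = new ByteArrayInterner();
        // Two separately allocated arrays with identical content.
        byte[] first = "credentials-for-app".getBytes(StandardCharsets.UTF_8);
        byte[] second = "credentials-for-app".getBytes(StandardCharsets.UTF_8);

        byte[] a = interner.intern(first);
        byte[] b = interner.intern(second);

        System.out.println("same instance: " + (a == b));
        System.out.println("unique entries: " + interner.uniqueCount());
    }
}
```

This pool holds strong references, so entries are never released. A long-running process such as the Resource Manager would more likely want a weak interner, for example the one returned by Guava's {{Interners.newWeakInterner()}}, so that values no longer referenced elsewhere can be garbage collected.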



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

