hadoop-common-issues mailing list archives

From "Misha Dmitriev (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14938) Configuration.updatingResource map should be initialized lazily
Date Fri, 13 Oct 2017 00:49:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202879#comment-16202879 ]

Misha Dmitriev commented on HADOOP-14938:

Thank you, Robert. I've fixed the checkstyle issues. Regarding your suggestion about the extra
local variable in the DCL (double-checked locking) code fragment: what they say in this article
may or may not be true as of the latest JDK versions, but it's certainly plausible. Every access
to a volatile field ({{updatingResource}} in our case) is indeed relatively expensive, because
the value cannot be kept in a register and each access imposes memory-ordering constraints.
Thus it makes sense to avoid one access in the most common case (the field is already
initialized) by using a local variable. In the latest JDK versions the compiler may be smart
enough to perform this optimization automatically, or it may not. So it's better to stay on
the safe side and add this local var, which is what I've done.
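
For readers unfamiliar with the pattern, here is a minimal sketch of lazy initialization via double-checked locking with the extra local variable. It is an illustration only, not the code from the attached patches: the class {{LazyConfig}}, its methods, and the {{Map<String, String[]>}} value type are hypothetical stand-ins chosen for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a configuration class; NOT the actual Hadoop code.
class LazyConfig {
    // volatile is required for double-checked locking to be correct
    // under the Java memory model (Java 5 and later).
    private volatile Map<String, String[]> updatingResource;

    // Returns the map, allocating it on first use.
    private Map<String, String[]> getOrCreateUpdatingResource() {
        // Copy the volatile field into a local once. On the common path
        // (already initialized) this is the only volatile read.
        Map<String, String[]> map = updatingResource;
        if (map == null) {
            synchronized (this) {
                // Re-check under the lock: another thread may have
                // initialized the field in the meantime.
                map = updatingResource;
                if (map == null) {
                    map = new ConcurrentHashMap<>();
                    updatingResource = map;
                }
            }
        }
        return map;
    }

    // Writers trigger allocation of the map.
    void putSource(String key, String[] sources) {
        getOrCreateUpdatingResource().put(key, sources);
    }

    // Readers never allocate: an uninitialized map behaves like an empty one.
    String[] getSource(String key) {
        Map<String, String[]> map = updatingResource;
        return map == null ? null : map.get(key);
    }

    // Exposed only so the example can show that allocation is deferred.
    boolean isMapAllocated() {
        return updatingResource != null;
    }
}
```

With this shape, an instance that never records anything never pays for an empty ConcurrentHashMap, and read-only callers do not force the allocation either.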

I suspect that the performance difference will not be significant here, because this
configuration code doesn't look like something that runs frequently. But it's good to follow
the proper coding patterns anyway.

> Configuration.updatingResource map should be initialized lazily
> ---------------------------------------------------------------
>                 Key: HADOOP-14938
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14938
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Misha Dmitriev
>            Assignee: Misha Dmitriev
>         Attachments: HADOOP-14938.01.patch, HADOOP-14938.02.patch, HADOOP-14938.03.patch
>
> Using jxray (www.jxray.com), I've analyzed a heap dump of YARN RM running in a big cluster.
> The tool uncovered several inefficiencies in the RM memory. It turns out that one of the biggest
> sources of memory waste, responsible for almost 1/4 of used memory, is empty ConcurrentHashMap
> instances in org.apache.hadoop.conf.Configuration.updatingResource:
> {code}
> 905,551K (24.0%): java.util.concurrent.ConcurrentHashMap: 22118 / 100% of empty 905,551K
> ↖org.apache.hadoop.conf.Configuration.updatingResource
> ↖{j.u.WeakHashMap}.keys
> ↖Java Static org.apache.hadoop.conf.Configuration.REGISTRY
> {code}
> That is, there are 22118 empty ConcurrentHashMaps here, and they collectively waste ~905MB
> of memory. This is caused by eager initialization of these maps. To address this problem,
> we should initialize them lazily.
