hadoop-common-issues mailing list archives

From "Misha Dmitriev (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-14938) Configuration.updatingResource map should be initialized lazily
Date Wed, 11 Oct 2017 18:22:00 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-14938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Misha Dmitriev updated HADOOP-14938:
    Attachment: HADOOP-14938.02.patch

Addressed Daniel's comments.

I've added {{volatile}} as you suggested, but note that, from my understanding of the Java
Memory Model and experience writing concurrent apps, it is not necessary here. Initially
{{updatingResource}} is null, and all threads see it as null, as expected.
Once some thread decides to call {{putIntoUpdatingResource()}}, it first reads this
field as null, then, after entering the {{synchronized}} block, reads its real value,
which may be non-null if some other thread got ahead and already set this field. What is important
is that when a thread exits the {{synchronized}} block, it has the effect of flushing the
cache to main memory, so that writes made by this thread become visible to other threads.
So even if {{updatingResource}} is not volatile, its new, non-null value will be visible to
all threads that subsequently read it.
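
For readers without the patch at hand, the lazy-initialization pattern described above can be sketched roughly as follows. This is not the code from HADOOP-14938.02.patch: only the names {{updatingResource}} and {{putIntoUpdatingResource()}} come from this message, while the class name, the map's type parameters, and the two helper methods are assumptions made for illustration. The sketch keeps {{volatile}}, which is the conservative choice because the first read of the field happens outside the lock.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for Configuration. Only the names updatingResource and
// putIntoUpdatingResource come from the message; everything else is assumed.
class LazyConfigSketch {
    // Starts out null, so an instance that never records a resource
    // does not pay for an empty ConcurrentHashMap.
    private volatile Map<String, String[]> updatingResource;

    void putIntoUpdatingResource(String key, String[] value) {
        Map<String, String[]> map = updatingResource; // first read, no lock held
        if (map == null) {
            synchronized (this) {
                map = updatingResource; // re-read under the lock
                if (map == null) {
                    // No other thread got ahead of us: allocate and publish.
                    map = new ConcurrentHashMap<>();
                    updatingResource = map;
                }
            }
        }
        map.put(key, value);
    }

    // Hypothetical read accessor: "not yet allocated" is treated as "no entry".
    String[] getUpdatingResource(String key) {
        Map<String, String[]> map = updatingResource;
        return map == null ? null : map.get(key);
    }

    // Hypothetical helper, present only so the laziness can be observed.
    boolean isAllocated() {
        return updatingResource != null;
    }
}
```

With this shape, an instance allocates the map on its first {{putIntoUpdatingResource()}} call, and readers must tolerate a null field.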

> Configuration.updatingResource map should be initialized lazily
> ---------------------------------------------------------------
>                 Key: HADOOP-14938
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14938
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Misha Dmitriev
>            Assignee: Misha Dmitriev
>         Attachments: HADOOP-14938.01.patch, HADOOP-14938.02.patch
>
> Using jxray (www.jxray.com), I've analyzed a heap dump of YARN RM running in a big cluster.
> The tool uncovered several inefficiencies in the RM memory. It turns out that one of the biggest
> sources of memory waste, responsible for almost 1/4 of used memory, is empty ConcurrentHashMap
> instances in org.apache.hadoop.conf.Configuration.updatingResource:
> {code}
> 905,551K (24.0%): java.util.concurrent.ConcurrentHashMap: 22118 / 100% of empty 905,551K
> ↖org.apache.hadoop.conf.Configuration.updatingResource
> ↖{j.u.WeakHashMap}.keys
> ↖Java Static org.apache.hadoop.conf.Configuration.REGISTRY
> {code}
> That is, there are 22118 empty ConcurrentHashMaps here, and they collectively waste ~905MB
> of memory. This is caused by eager initialization of these maps. To address this problem,
> we should initialize them lazily.
