hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3619) ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException
Date Wed, 30 Sep 2015 15:45:04 GMT

    [ https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14937022#comment-14937022 ]

Jason Lowe commented on YARN-3619:

My apologies for the long delay; this fell off my radar.  The approach seems reasonable.

The patch needs to be upmerged to trunk.  In addition, I'm wondering about the Timer handling.
I think the Timer thread should be a daemon thread (we don't want it to prolong NM shutdown).
Also, it seems wasteful to dedicate a separate timer thread to every container that finishes.
It would be more efficient to share one timer that handles multiple timer tasks rather than
spawn a thread for every timer task.
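
To illustrate the suggestion above, here is a minimal sketch of a single shared daemon timer, assuming the delayed cleanup can be expressed as a Runnable. The class name SharedUnregisterTimer, the method scheduleUnregister, and the thread name are hypothetical and are not taken from the patch; only java.util.Timer and java.util.TimerTask are real JDK APIs.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of the YARN-3619 patch. */
final class SharedUnregisterTimer {
    // One timer thread shared by every container. The second constructor
    // argument (isDaemon = true) makes it a daemon thread, so it cannot
    // keep the JVM alive during shutdown.
    private static final Timer SHARED_TIMER =
        new Timer("Shared unregister timer", true);

    private SharedUnregisterTimer() {}

    /** Queues a cleanup task on the shared timer instead of starting a new thread. */
    static void scheduleUnregister(final Runnable cleanup, long delayMs) {
        SHARED_TIMER.schedule(new TimerTask() {
            @Override
            public void run() {
                cleanup.run();
            }
        }, delayMs);
    }

    public static void main(String[] args) throws InterruptedException {
        final CountDownLatch done = new CountDownLatch(3);
        for (int i = 0; i < 3; i++) {
            final int id = i;
            scheduleUnregister(() -> {
                Thread t = Thread.currentThread();
                System.out.println("container " + id + " cleaned up on '"
                    + t.getName() + "', daemon=" + t.isDaemon());
                done.countDown();
            }, 50);
        }
        // Wait here only so the demo output appears before the JVM exits.
        done.await(5, TimeUnit.SECONDS);
    }
}
```

All three tasks run on the same timer thread, so the thread count stays constant no matter how many containers finish. One trade-off of a single java.util.Timer is that its tasks run sequentially, so a slow cleanup delays the tasks queued behind it.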

> ContainerMetrics unregisters during getMetrics and leads to ConcurrentModificationException
> -------------------------------------------------------------------------------------------
>                 Key: YARN-3619
>                 URL: https://issues.apache.org/jira/browse/YARN-3619
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.7.0
>            Reporter: Jason Lowe
>            Assignee: zhihai xu
>         Attachments: YARN-3619.000.patch, test.patch
> ContainerMetrics is able to unregister itself during the getMetrics method, but that
> method can be called by MetricsSystemImpl.sampleMetrics which is trying to iterate the sources.
> This leads to a ConcurrentModificationException log like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN impl.MetricsSystemImpl:
> {noformat}
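
The failure mode described in the issue can be reproduced outside Hadoop with a small sketch. Registry and Source below are hypothetical stand-ins, not MetricsSystemImpl or ContainerMetrics, and the deferred-removal list is only one illustrative way to move the unregistration out of the iteration; it is not claimed to be what the attached patch does.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SelfUnregisterDemo {
    /** Hypothetical stand-in for a metrics source. */
    interface Source {
        void getMetrics(Registry registry);
    }

    /** Hypothetical stand-in for the metrics system's table of sources. */
    static final class Registry {
        private final Map<String, Source> sources = new LinkedHashMap<>();
        private final List<String> pendingRemovals = new ArrayList<>();

        void register(String name, Source source) { sources.put(name, source); }
        void unregisterNow(String name) { sources.remove(name); }
        void unregisterLater(String name) { pendingRemovals.add(name); }
        int size() { return sources.size(); }

        /** Iterates the sources, as a sampling pass would. */
        void sampleAll() {
            for (Source source : sources.values()) {
                source.getMetrics(this); // may call back into this registry
            }
            // Deferred removals are applied only after the iteration ends.
            for (String name : pendingRemovals) {
                sources.remove(name);
            }
            pendingRemovals.clear();
        }
    }

    /** Returns true if sampling threw ConcurrentModificationException. */
    static boolean sampleThrows(boolean defer) {
        Registry registry = new Registry();
        registry.register("container_1", reg -> {
            if (defer) {
                reg.unregisterLater("container_1");
            } else {
                reg.unregisterNow("container_1"); // mutates the map mid-iteration
            }
        });
        registry.register("container_2", reg -> { });
        try {
            registry.sampleAll();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("unregister inside getMetrics throws: " + sampleThrows(false));
        System.out.println("deferred unregister throws: " + sampleThrows(true));
    }
}
```

The exception here is raised on a single thread: the map's iterator detects that the map was structurally modified between calls to next(), which matches a source removing itself while the sources are being iterated.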

This message was sent by Atlassian JIRA
