hadoop-mapreduce-issues mailing list archives

From "Robert Joseph Evans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4233) NPE can happen in RMNMNodeInfo.
Date Thu, 10 May 2012 21:34:54 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272785#comment-13272785 ]

Robert Joseph Evans commented on MAPREDUCE-4233:
------------------------------------------------

Ahmed,

We were using the capacity scheduler at the time.

I debated how to special-case this, and I am fine with changing it, so let me know if you
feel strongly about it.

I thought about simply returning 0 for NumContainers, UsedMemoryMB and AvailableMemoryMB,
because those are the true values: no containers are running, no memory is in use, and because
the node is unavailable there is no available memory.  This is what the web service at /ws/v1/cluster/nodes
does for these values, but I thought leaving them out was a better flag to indicate that
something was wrong with the node.  I don't really want to return something like
{code}
{
  ...
  "NumContainers" : "Scheduler report unavailable",
  ...
}
{code}
These fields normally hold numbers, so returning a string in their place doesn't feel right
to me, but returning a -1 as a flag doesn't feel quite right either.
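
For illustration only, here is a minimal sketch of the option argued for above: null-check the scheduler report and leave out NumContainers, UsedMemoryMB and AvailableMemoryMB when it is missing. NodeReport, NodeInfoSketch, describeNode and the "HostName" key are hypothetical stand-ins for this example, not the actual YARN classes or the attached patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the scheduler's per-node report; not the real YARN type.
final class NodeReport {
    final int numContainers;
    final int usedMemoryMB;
    final int availableMemoryMB;

    NodeReport(int numContainers, int usedMemoryMB, int availableMemoryMB) {
        this.numContainers = numContainers;
        this.usedMemoryMB = usedMemoryMB;
        this.availableMemoryMB = availableMemoryMB;
    }
}

// Hypothetical helper showing the "omit the fields" behaviour.
final class NodeInfoSketch {
    // Builds the attribute map for one node. The report may be null when the
    // scheduler no longer tracks a node that the RM context still lists.
    static Map<String, Object> describeNode(String hostName, NodeReport report) {
        Map<String, Object> info = new LinkedHashMap<>();
        info.put("HostName", hostName); // illustrative key, assumed for this sketch
        if (report != null) {
            // Only report scheduler-derived values when the scheduler has them.
            info.put("NumContainers", report.numContainers);
            info.put("UsedMemoryMB", report.usedMemoryMB);
            info.put("AvailableMemoryMB", report.availableMemoryMB);
        }
        // With a null report the three keys are simply absent, which a client
        // can treat as "scheduler report unavailable" without a sentinel value.
        return info;
    }
}
```

With this shape, describeNode("host1", null) yields a map holding only the host name, while a node with a report carries all three numeric fields, so every field that is present is still a number.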
                
> NPE can happen in RMNMNodeInfo.
> -------------------------------
>
>                 Key: MAPREDUCE-4233
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4233
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.23.3
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>            Priority: Critical
>         Attachments: MR-4233.txt
>
>
> {noformat}
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.resourcemanager.RMNMInfo.getLiveNodeManagers(RMNMInfo.java:96)
>         at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
>         at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
>         at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
>         at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65)
>         at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)
>         at javax.management.StandardMBean.getAttribute(StandardMBean.java:358)
>         at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
> {noformat}
> Looks like rmcontext.getRMNodes() is not kept in sync with scheduler.getNodeReport(),
> so that the report can be null even though the context still knows about the node.
> The simple fix is to add in a null check.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
