hadoop-common-dev mailing list archives

From "Nigel Daley (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HADOOP-2108) NullPointerException in JVMMetrics for OOM killed task
Date Mon, 29 Oct 2007 05:09:50 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Nigel Daley resolved HADOOP-2108.

       Resolution: Duplicate
    Fix Version/s: 0.14.3

Duplicate of HADOOP-2036, which was fixed in 0.14.3.

> NullPointerException in JVMMetrics for OOM killed task
> ------------------------------------------------------
>                 Key: HADOOP-2108
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2108
>             Project: Hadoop
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 0.14.2
>         Environment: Centos5 jdk1.6.0_02
>            Reporter: Richard Lee
>            Priority: Minor
>             Fix For: 0.14.3
> I had a reduce task run out of memory and die in such a way that JvmMetrics.doThreadUpdates() throws a NullPointerException.
> The apparent cause is that the call to threadMXBean.getThreadInfo() at JvmMetrics.java:119 returns an array of ThreadInfo whose elements may be null.
> Here's a relevant quote from the javadoc:
>     This method returns an array of the ThreadInfo objects,
>     each is the thread information about the thread with the same index
>     as in the ids array.
>     If a thread of the given ID is not alive or does not exist,
>     null will be set in the corresponding element
>     in the returned array.  A thread is alive if
>     it has been started and has not yet died.
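
The javadoc behavior quoted above can be reproduced in isolation. The following is only a minimal sketch, not code from Hadoop: the class name NullThreadInfoDemo and the helper lookup() are hypothetical, and it assumes that no live thread has the ID Long.MAX_VALUE.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class NullThreadInfoDemo {

    // Hypothetical helper: thin wrapper around ThreadMXBean.getThreadInfo(long[]).
    public static ThreadInfo[] lookup(long... ids) {
        ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
        return threadMXBean.getThreadInfo(ids);
    }

    public static void main(String[] args) {
        // First ID belongs to the running main thread; the second is assumed
        // not to belong to any thread, so its slot should come back as null.
        long liveId = Thread.currentThread().getId();
        long missingId = Long.MAX_VALUE; // assumption: no thread has this ID

        ThreadInfo[] infos = lookup(liveId, missingId);
        for (int i = 0; i < infos.length; i++) {
            // Dereferencing infos[i] without this check is what produces the NPE.
            String state = (infos[i] == null)
                    ? "null (thread not alive or nonexistent)"
                    : infos[i].getThreadState().toString();
            System.out.println("slot " + i + ": " + state);
        }
    }
}
```

In a real process the same null slot can appear whenever a thread exits between the call that collects the IDs and the call to getThreadInfo().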
> My stacktrace looks like this:
> java.lang.NullPointerException
> 	at org.apache.hadoop.metrics.jvm.JvmMetrics.doThreadUpdates(JvmMetrics.java:129)
> 	at org.apache.hadoop.metrics.jvm.JvmMetrics.doUpdates(JvmMetrics.java:79)
> 	at org.apache.hadoop.metrics.spi.AbstractMetricsContext.timerEvent(AbstractMetricsContext.java:284)
> 	at org.apache.hadoop.metrics.spi.AbstractMetricsContext.access$000(AbstractMetricsContext.java:50)
> 	at org.apache.hadoop.metrics.spi.AbstractMetricsContext$1.run(AbstractMetricsContext.java:249)
> 	at java.util.TimerThread.mainLoop(Timer.java:512)
> 	at java.util.TimerThread.run(Timer.java:462)
> On line 129, there's an attempt to dereference the potentially null threadInfo value to get its current state.
> The naive solution here is to check for null and count null values as "terminated"... but it seems clear that a thread state of TERMINATED and a null ThreadInfo value are distinct cases and may need special treatment.
> Guessing that this is a "minor" issue because it seems more cosmetic than mission critical. I'm not sure what the upstream effects are of this method throwing the NPE, so I didn't set it to "trivial".
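
One way to handle the distinction raised in the description is to skip null entries and tally them separately rather than folding them into TERMINATED. The sketch below is an illustration under stated assumptions, not the fix applied in HADOOP-2036: the names ThreadStateCounter, Counts, and vanished are hypothetical, and it uses Java 8 features that postdate the JDK named in the report.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

public class ThreadStateCounter {

    // Hypothetical result holder: per-state counts plus a separate tally of
    // IDs for which getThreadInfo() returned null.
    public static final class Counts {
        public final Map<Thread.State, Integer> byState =
                new EnumMap<>(Thread.State.class);
        public int vanished;
    }

    // Counts thread states, guarding against null array elements.
    public static Counts count(ThreadInfo[] infos) {
        Counts counts = new Counts();
        for (ThreadInfo info : infos) {
            if (info == null) {
                // Thread died or never existed; keep it distinct from TERMINATED.
                counts.vanished++;
                continue;
            }
            counts.byState.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
        // A thread may exit between these two calls, which is exactly the
        // situation that yields a null element.
        long[] ids = threadMXBean.getAllThreadIds();
        Counts counts = count(threadMXBean.getThreadInfo(ids));
        System.out.println("by state: " + counts.byState);
        System.out.println("vanished: " + counts.vanished);
    }
}
```

Whether vanished threads should be reported as their own metric, merged into the terminated count, or dropped is a design choice this sketch leaves open.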

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
