hadoop-common-issues mailing list archives

From "Henning Blohm (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-7004) Problem with org.apache.hadoop.conf.Configuration.REGISTRY
Date Thu, 21 Oct 2010 08:14:19 GMT
Problem with org.apache.hadoop.conf.Configuration.REGISTRY

                 Key: HADOOP-7004
                 URL: https://issues.apache.org/jira/browse/HADOOP-7004
             Project: Hadoop Common
          Issue Type: Bug
         Environment: hadoop 0.20.2, hbase 0.20.6
            Reporter: Henning Blohm
            Priority: Minor

When a Configuration instance that had a resource added via
addResource(InputStream) is reused, a later reload of the configuration
fails because the stream has already been read.
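
The underlying effect can be reproduced without Hadoop. The following is a minimal sketch using only the JDK: java.util.Properties stands in for the configuration parser, and the class name StreamReloadSketch and the key my.key are made up for illustration. It shows only that a second parse of the same stream sees no data; how Configuration itself reacts to the exhausted stream is not modeled here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class StreamReloadSketch {
    // Parses whatever is left in the stream into a fresh Properties object,
    // the way a (re)load would re-parse its resources.
    static Properties load(InputStream in) {
        try {
            Properties props = new Properties();
            props.load(in);
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(
                "my.key=my.value\n".getBytes(StandardCharsets.ISO_8859_1));

        Properties first = load(in);   // consumes the stream
        Properties second = load(in);  // stream is already at end-of-data

        System.out.println(first.getProperty("my.key"));   // prints "my.value"
        System.out.println(second.getProperty("my.key"));  // prints "null"
    }
}
```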

The reload gets triggered when addDefaultResource is called. That method
uses the REGISTRY static WeakHashMap to reach out to all reachable Configuration 
instances to reset their properties.
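
To make the mechanism concrete, here is a simplified sketch of the pattern described above. MiniConfiguration is a hypothetical stand-in and not the Hadoop class; only the names REGISTRY, addDefaultResource and reloadConfiguration are taken from the description, while the resource name extra-site.xml and the loadCount field are invented so the side effect can be observed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

public class RegistrySketch {
    // Hypothetical stand-in for Configuration, reduced to the registry logic.
    static class MiniConfiguration {
        // Weak keys: an instance stays registered until it is garbage-collected.
        private static final Map<MiniConfiguration, Object> REGISTRY = new WeakHashMap<>();
        private static final List<String> DEFAULT_RESOURCES = new ArrayList<>();

        private Map<String, String> properties; // null means "reload on next access"
        int loadCount = 0;                      // counts (re)loads, for illustration

        MiniConfiguration() {
            synchronized (MiniConfiguration.class) {
                REGISTRY.put(this, null);
            }
        }

        // Static call that reaches into every live instance.
        static synchronized void addDefaultResource(String name) {
            if (!DEFAULT_RESOURCES.contains(name)) {
                DEFAULT_RESOURCES.add(name);
                for (MiniConfiguration conf : REGISTRY.keySet()) {
                    conf.reloadConfiguration();
                }
            }
        }

        synchronized void reloadConfiguration() {
            properties = null; // only resets members; resources are re-read lazily
        }

        synchronized String get(String key) {
            if (properties == null) {
                properties = new HashMap<>(); // a real class would re-read all resources here
                loadCount++;
            }
            return properties.get(key);
        }
    }

    public static void main(String[] args) {
        MiniConfiguration conf = new MiniConfiguration();
        conf.get("some.key");
        System.out.println(conf.loadCount); // prints "1"

        // Unrelated code registers a default resource ...
        MiniConfiguration.addDefaultResource("extra-site.xml");

        // ... and the existing instance is forced to load again.
        conf.get("some.key");
        System.out.println(conf.loadCount); // prints "2"
    }
}
```

The second load is where a previously consumed InputStream resource would be read again.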

The method addDefaultResource is called by, e.g., ConfigUtil in
org.apache.hadoop.mapreduce.util (hadoop trunk) or JobConf (hadoop 0.20.2).

The problem has been observed in Hadoop 0.20.2 but the code in trunk has
essentially the same structure. 

There are a few problems here:

1. You cannot safely use addResource(InputStream) if Configuration
objects are to be reused (you can, however, use addResource(URL) instead).

2. Modifying the state of Configuration instances at some later point in
time, as a side effect of some class initialization in some completely
unrelated thread, leads to unpredictable behavior (properties change under
the hood).

3. Configuration instances keep context classloaders to find resources.
After redeployment these may no longer be "valid". As long as a
Configuration instance has not been garbage-collected, addDefaultResource
will still invoke reloadConfiguration on it. While that is harmless today
(it only resets members), this looks like a ticking time bomb.
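
The reason the URL variant mentioned in point 1 survives a reload is that a URL can be opened again on every load. The following is a minimal JDK-only sketch of that difference; java.util.Properties again stands in for the configuration parser, and UrlReloadSketch, tempResource and my.key are hypothetical names introduced for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class UrlReloadSketch {
    // Opens a fresh stream for every load, so repeated loads see the full content.
    static Properties load(URL url) {
        try (InputStream in = url.openStream()) {
            Properties props = new Properties();
            props.load(in);
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the example: writes the content to a temp file and returns its URL.
    static URL tempResource(String content) {
        try {
            Path file = Files.createTempFile("conf-sketch", ".properties");
            file.toFile().deleteOnExit();
            Files.write(file, content.getBytes(StandardCharsets.ISO_8859_1));
            return file.toUri().toURL();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URL url = tempResource("my.key=my.value\n");
        System.out.println(load(url).getProperty("my.key")); // prints "my.value"
        System.out.println(load(url).getProperty("my.key")); // prints "my.value" again
    }
}
```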

Suggested fix: define all default resources in Configuration once. Do not
hold on to other Configuration instances and do not modify their state as a
side effect of some other activity.
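
One possible shape of such a design, as a rough sketch only and not a patch against Hadoop: the default resources are a fixed, unmodifiable list, there is no static registry of instances, and each instance changes only when its own methods are called. The class FixedDefaultsConfiguration and the resource name my-site.xml are hypothetical, and the listed defaults are just an example selection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SnapshotSketch {
    // Hypothetical alternative to a registry-based configuration class.
    static final class FixedDefaultsConfiguration {
        // Defined once, in one place; cannot be extended at runtime.
        static final List<String> DEFAULT_RESOURCES = Collections.unmodifiableList(
                Arrays.asList("core-default.xml", "core-site.xml"));

        // Per-instance state, touched only through this instance's own methods.
        private final List<String> resources = new ArrayList<>(DEFAULT_RESOURCES);

        void addResource(String name) {
            resources.add(name);
        }

        List<String> resources() {
            return Collections.unmodifiableList(resources);
        }
    }

    public static void main(String[] args) {
        FixedDefaultsConfiguration a = new FixedDefaultsConfiguration();
        FixedDefaultsConfiguration b = new FixedDefaultsConfiguration();
        a.addResource("my-site.xml");

        System.out.println(a.resources()); // prints "[core-default.xml, core-site.xml, my-site.xml]"
        System.out.println(b.resources()); // prints "[core-default.xml, core-site.xml]"
    }
}
```

Because no instance is reachable through static state, nothing outside an instance can reset it, and instances tied to an outdated classloader are never touched again.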

See also: http://osdir.com/ml/general/2010-10/msg25893.html

