commons-dev mailing list archives

From Oliver Heger <oliver.he...@oliver-heger.de>
Subject Re: [configuration] Thoughts about multi-threading
Date Wed, 19 Sep 2012 20:19:37 GMT
Hi Phil,

On 18.09.2012 20:09, Phil Steitz wrote:
> On 9/17/12 12:39 PM, Oliver Heger wrote:
>> Hi Jörg,
>>
>> many thanks for your input!
>>
>> On 17.09.2012 10:01, Jörg Schaible wrote:
>>> Hi Oliver,
>>>
>>> Oliver Heger wrote:
>>>
>>>> Hi,
>>>>
>>>> one limitation of the 1.x versions of [configuration] is the incomplete
>>>> support for concurrent access to Configuration objects. In version 2.0
>>>> we should try to improve this.
>>>>
>>>> I have some ideas about this topic - not fully thought out - and would
>>>> like to start a discussion. Here they are (in no specific order):
>>>>
>>>> - Thread-safety is not always required. Therefore, I would like to take
>>>> an approach similar to the JDK Collections framework: having basic,
>>>> unsynchronized configurations which can be turned into thread-safe ones.
>>>
>>> Fair approach.
>>>
>>>> - Many Configuration implementations are based on a hash map. Access to
>>>> their content could be made thread-safe by replacing the map by a
>>>> ConcurrentHashMap.
>>>
>>> You could use a protected method to instantiate the underlying HashMap.
>>> Then you're free in overloaded Configurations.synchronizedConfiguration
>>> methods to use a derived class or a wrapper. It also has implications for
>>> subset().
>> Or pass the map in at construction time. Using a protected method
>> to create the map would either mean that the constructor has to
>> invoke this method (problematic for subclasses which are not yet
>> fully initialized at this time) or the field cannot be made final.
>> Alternatively, there could be an abstract method getMap()
>> returning the reference to the map.
>>
>>>
>>>> - For hierarchical configurations the situation is more complex. Here we
>>>> will probably need something like a ReadWriteLock to protect their
>>>> content. (There could be different lock implementations including a
>>>> dummy one used by unsynchronized configurations).
>>>>
>>>> - Reloading is a major problem; at least in the way it is implemented
>>>> now, it is very hard to get synchronization correct and efficient.
>>>> Therefore, I would like to use a different strategy here. One option
>>>> could be to not handle reloading in Configuration objects, but on an
>>>> additional layer which creates such objects.
>>>>
>>>> - Other properties of configuration objects (e.g. the
>>>> throwExceptionOnMissing flag or the file name) must be taken into
>>>> account, too. In a typical use case those should not be accessed
>>>> frequently, so it is probably not an issue to always synchronize them or
>>>> make them volatile.
>>>>
>>>> Looking forward to your input
>>>
>>> Another option would be immutability (well, apart probably from reloading).
>>> Personally, I often have the use case that I do not want to allow my
>>> clients/consumers to write into the configuration. One approach can also be
>>> the JDK approach creating Collections.unmodifiableConfiguration.
>>>
>>> However, what also bugs me in the meantime is the current tight coupling
>>> between the configuration object and its format. Why should I care at all
>>> in what format the configuration was saved when I access its values?
>>>
>>> For some time now I have been thinking of something along the lines of:
>>>
>>> - interface Configuration: Core interface, only getters
>>> - interface ReloadableConfiguration extends Configuration, Reloadable
>>> - class BaseConfiguration: In memory, implements all the stuff for
>>>   interpolation and the setters
>>>
>>> - interface ConfigurationSource: Core interface to load (and probably
>>>   save) a configuration
>>> - class PropertiesConfigurationSource: Concrete implementation that loads
>>>   a properties file and creates a BaseConfiguration
>>>
>>> This approach offers immutability for the Configuration itself and also
>>> allows Serializability. Format is separated completely from the
>>> configuration functionality.
>>>
>>> I know, this looks more like Configuration 3.0 ... ;-)
>>>
>> I really like this approach. I was also thinking about separating
>> loading and saving from core Configuration classes. However, I
>> fear such an approach will make it difficult to preserve the
>> format of a configuration. E.g. XMLConfiguration currently stores
>> the XML document it was loaded from. So when saved to disk, the result
>> looks much like the original document.
>>
>> Read-only configurations are also an interesting topic.
>
> This obviously makes the concurrency problem easier :)
>
> Apart from this case, it would be good to agree on exactly what it
> means for [configuration] to be threadsafe.  Is it basically the
> semantics of ConcurrentHashMap?  Or are there sequencing / event
> serialization constraints?  For example, suppose the sequence below
> happens:
> Thread A start add property
> Thread B start clear
> Thread A notify property change
> Thread B notify clear
> Thread B clear map
> Thread A update map
>
> Is it OK for this sequence to happen?  Is it OK for A's add to trump
> B's clear even though B's activation started later and B's
> notification was later?

This is a very good point; I had not thought about this. It would be very 
confusing for an event listener to receive a clear event and then find 
out that the configuration is not empty.
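
To make the problem concrete, the interleaving from Phil's mail can be
replayed step by step in a single thread. This is only an illustrative
sketch: the class InterleavingDemo, the event strings, and the key "foo"
are made up, and a plain ConcurrentHashMap stands in for a map-based
configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo, not part of [configuration]: replays the steps of the
// two threads in the order given in the mail, without real concurrency.
class InterleavingDemo {
    final Map<String, Object> map = new ConcurrentHashMap<>();
    final List<String> events = new ArrayList<>();

    void replay() {
        events.add("ADD foo");   // Thread A: notify property change
        events.add("CLEAR");     // Thread B: notify clear
        map.clear();             // Thread B: clear map
        map.put("foo", "bar");   // Thread A: update map
    }

    public static void main(String[] args) {
        InterleavingDemo demo = new InterleavingDemo();
        demo.replay();
        String last = demo.events.get(demo.events.size() - 1);
        System.out.println("last event seen: " + last);
        System.out.println("map is empty:    " + demo.map.isEmpty());
    }
}
```

Every single map operation is thread-safe here, yet the last event a
listener sees is the clear while the map still contains "foo".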

So I guess it is too naive to simply replace the plain map by a 
concurrent map to gain thread-safety. We will then probably have to use 
a read-write lock for map-based configurations, too. Well, this is not 
too bad. The only thing which worries me a bit is that we have to call 
event listeners while the lock is held. This is an anti-pattern 
described by Bloch: "Don't call an alien method with a lock held!" Does 
anybody have an idea how we could prevent this?
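
One possible direction, sketched under assumptions and not a worked-out
proposal: change the map and enqueue the event while holding the write
lock, but let a single dispatcher thread call the listeners after the
lock has been released. All names below (LockedMapConfiguration, the
String events, the Consumer-based listeners) are hypothetical and not
existing [configuration] API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

// Hypothetical sketch: map-based configuration guarded by a ReadWriteLock.
class LockedMapConfiguration {
    private final Map<String, Object> map = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();
    // One dispatcher thread, so events are delivered in enqueue order.
    private final ExecutorService dispatcher = Executors.newSingleThreadExecutor();

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    Object getProperty(String key) {
        lock.readLock().lock();
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    void addProperty(String key, Object value) {
        lock.writeLock().lock();
        try {
            map.put(key, value);
            enqueue("ADD " + key);   // only queued here, no listener is called
        } finally {
            lock.writeLock().unlock();
        }
    }

    void clear() {
        lock.writeLock().lock();
        try {
            map.clear();
            enqueue("CLEAR");
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Called with the write lock held, so queue order equals update order.
    private void enqueue(String event) {
        dispatcher.execute(() -> {
            for (Consumer<String> listener : listeners) {
                listener.accept(event);   // runs on the dispatcher, lock not held
            }
        });
    }

    // Stops the dispatcher thread; returns false if events are still pending.
    boolean shutdown() {
        dispatcher.shutdown();
        try {
            return dispatcher.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The price is that listeners are notified asynchronously: events arrive
in the order of the updates, but the configuration may already have
changed again by the time a listener reads it.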

Regarding read-only configurations: I am a big fan of immutable objects, 
also because they are inherently thread-safe. But we will have to study 
carefully the use cases in which we do not need mutability.
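
As a rough illustration of what a read-only variant could look like,
here is a minimal sketch of an immutable snapshot that exposes only
getters. The interface ReadableConfiguration and the class
ImmutableConfiguration are hypothetical names chosen for this example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical getter-only interface; not the current Configuration interface.
interface ReadableConfiguration {
    Object getProperty(String key);

    boolean containsKey(String key);
}

// Immutable snapshot: the data is copied once and never changed afterwards.
final class ImmutableConfiguration implements ReadableConfiguration {
    private final Map<String, Object> data;

    ImmutableConfiguration(Map<String, ?> source) {
        // Defensive copy, so later changes to the source map are not visible.
        this.data = Collections.unmodifiableMap(new HashMap<String, Object>(source));
    }

    @Override
    public Object getProperty(String key) {
        return data.get(key);
    }

    @Override
    public boolean containsKey(String key) {
        return data.containsKey(key);
    }
}
```

Such an object can be shared between threads without locking, provided
the stored values are themselves immutable.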

This leads me to another point: Isn't it problematic to use mutable 
configurations with reloading? If a reload can happen at any time (as 
currently implemented, at least on each property read), there is a high 
risk that updates to the configuration get lost when a reload happens 
before a save.
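
The lost update can be shown with a toy model. This is a sketch under
assumptions, not the real reloading code: the class ReloadingDemo is
made up, a map named disk stands for the configuration file, and
reload() simply replaces the in-memory state with the file content.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a file-based configuration with reloading.
class ReloadingDemo {
    final Map<String, String> disk = new HashMap<>();      // stands for the file
    private Map<String, String> memory = new HashMap<>();  // in-memory state

    void reload() {
        memory = new HashMap<>(disk);   // discards all unsaved changes
    }

    void save() {
        disk.clear();
        disk.putAll(memory);
    }

    void setProperty(String key, String value) {
        memory.put(key, value);
    }

    String getProperty(String key) {
        return memory.get(key);
    }

    public static void main(String[] args) {
        ReloadingDemo config = new ReloadingDemo();
        config.disk.put("timeout", "10");
        config.reload();                       // initial load
        config.setProperty("timeout", "30");   // update, not yet saved
        config.disk.put("retries", "3");       // file is changed externally
        config.reload();                       // reload happens before save()
        System.out.println("timeout = " + config.getProperty("timeout"));
    }
}
```

After the second reload the value 30 is gone and the old value from the
file is back, without any error being reported to the code that set it.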

Oliver

>
> Phil
>
>>
>> But I think you are right, we have to start with smaller steps
>> first. Not sure whether we can manage this - but I would really
>> like to get something out in the not-too-far future.
>>
>> Oliver
>>
>>> - Jörg
>>>
>>>
>>> ---------------------------------------------------------------------
>>>
>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>