karaf-issues mailing list archives

From "Thomas Draier (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (KARAF-5562) Improve cellar groups configuration synchronisation from hazelcast
Date Mon, 08 Jan 2018 10:20:00 GMT

     [ https://issues.apache.org/jira/browse/KARAF-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Draier updated KARAF-5562:
---------------------------------
    Summary: Improve cellar groups configuration synchronisation from hazelcast
    (was: Improve cellar groups configuration from hazelcast)

> Improve cellar groups configuration synchronisation from hazelcast
> ------------------------------------------------------------------
>
>                 Key: KARAF-5562
>                 URL: https://issues.apache.org/jira/browse/KARAF-5562
>             Project: Karaf
>          Issue Type: Improvement
>          Components: cellar-hazelcast
>    Affects Versions: 4.0.10, 4.1.4
>            Reporter: Thomas Draier
>
> We encountered several issues caused by HazelcastGroupManager. I'm grouping them here
because they are all linked and we fixed them in a single refactoring of the class. Overall,
this results in better synchronization of the cellar groups configuration.
> - Hazelcast network splits can badly corrupt the “groups” shared map. This map contains
the list of groups and their members, and the system relies on it entirely to know which
groups a node belongs to. If multiple nodes update the map while they are not connected to
each other (easy to reproduce by starting both nodes at the same time) and then join afterwards,
the default merge algorithm is applied and simply overwrites the full map. As a result, groups
lose members, even though the configuration file claims that the nodes are still members.
> - When handling the groups configuration, HazelcastGroupManager replicates the felix.fileinstall.filename
property, which contains the configuration file path, to each node. This is acceptable on a
cluster where every node is installed at the exact same path. However, with 2 nodes on the
same machine installed at different paths, one node will at some point write to the other
node's config file and never update its own, which can be quite confusing.
> - The HazelcastGroupManager can start before fileinstall has detected the configuration.
It then creates a new config based on the hazelcast shared config, which overrides the config
file once fileinstall detects it. The impact is not huge, but it shuffles the properties
file and makes it unreadable.
> - Updates from hazelcast to the local config trigger an update back to hazelcast, which
in turn goes back to the local config and sometimes reverts the changes, resulting in no change
in the config. Basically, when adding a group, many properties are updated, and for each of
them we trigger a configuration update. Each configuration update triggers an event which
sends the whole config back to hazelcast, including properties that are not updated yet, setting
them back to their old values. All events (hazelcast updates and osgi config) are handled
asynchronously, so depending on the order of events, some properties can be reverted or never
added (the groups property is usually reverted after a group add).
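
To illustrate the first issue, here is a minimal sketch of a merge that takes the union of group members from both sides of a split instead of letting one side overwrite the other. It is plain Java with no Hazelcast dependency; the class GroupMergeSketch and its methods are hypothetical names used for illustration only and do not describe the actual refactoring of HazelcastGroupManager.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Cellar or Hazelcast.
class GroupMergeSketch {

    // Copies every group of 'source' into 'target', adding members to existing groups.
    static void addAll(Map<String, Set<String>> target, Map<String, Set<String>> source) {
        for (Map.Entry<String, Set<String>> entry : source.entrySet()) {
            target.computeIfAbsent(entry.getKey(), k -> new TreeSet<>())
                  .addAll(entry.getValue());
        }
    }

    // Merges the two views of the "groups" map held by each side of a split.
    // The inputs are left untouched; the result holds the union of members per group.
    static Map<String, Set<String>> mergeGroups(Map<String, Set<String>> sideA,
                                                Map<String, Set<String>> sideB) {
        Map<String, Set<String>> merged = new HashMap<>();
        addAll(merged, sideA);
        addAll(merged, sideB);
        return merged;
    }

    public static void main(String[] args) {
        // Each node registered itself in "default" while disconnected from the other.
        Map<String, Set<String>> sideA = new HashMap<>();
        sideA.put("default", new TreeSet<>(Arrays.asList("node1")));
        Map<String, Set<String>> sideB = new HashMap<>();
        sideB.put("default", new TreeSet<>(Arrays.asList("node2")));

        Map<String, Set<String>> merged = mergeGroups(sideA, sideB);
        System.out.println(merged.get("default")); // prints [node1, node2]
    }
}
```

A union merge cannot tell a member that was deliberately removed during the split from one that was never seen by the other side, so a real fix would still need the local configuration file as the source of truth for a node's own membership.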

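The revert loop described in the last issue can be modelled with a small single-threaded sketch. It assumes two things that are not stated in the report: that a shared update is applied to the local config as one snapshot rather than property by property, and that a local change is only published when it actually differs from the shared state. ConfigSyncSketch, its fields, and the property name web.example are hypothetical and are not the actual HazelcastGroupManager code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical single-threaded model: 'shared' stands in for the hazelcast map
// and 'local' for the OSGi configuration of one node.
class ConfigSyncSketch {
    final Map<String, String> local = new HashMap<>();
    final Map<String, String> shared = new HashMap<>();
    int publishCount = 0;

    // Called when the shared config changes: apply the whole snapshot at once,
    // so the local side never holds a half-updated configuration.
    void onSharedUpdate() {
        if (local.equals(shared)) {
            return; // nothing to apply
        }
        local.clear();
        local.putAll(shared);
        onLocalUpdate(); // models the single configuration event this update fires
    }

    // Called when the local config changes: publish only real differences,
    // which drops the echo of an update that came from the shared map.
    void onLocalUpdate() {
        if (local.equals(shared)) {
            return; // echo of a shared update, do not send it back
        }
        shared.clear();
        shared.putAll(local);
        publishCount++;
    }

    public static void main(String[] args) {
        ConfigSyncSketch node = new ConfigSyncSketch();
        node.local.put("groups", "default");
        // Another node added the group "web" and one made-up property for it.
        node.shared.put("groups", "default,web");
        node.shared.put("web.example", "value");

        node.onSharedUpdate();
        System.out.println(node.local.get("groups")); // prints default,web
        System.out.println(node.publishCount);        // prints 0
    }
}
```

In this model the stale local value of groups is never sent back, because the only event fired after the snapshot is applied sees local and shared as equal. With real asynchronous events the comparison would also have to cope with updates arriving out of order.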


--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
