cxf-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "A. Rothman" <>
Subject Fwd: DOSGi - service not exported consistently
Date Wed, 20 Mar 2013 18:25:02 GMT

[The message body disappeared for some reason, resending...]


While trying to use DOSGi (1.4.0, in Karaf 2.3.1) I came across some 
serious issues, namely when starting up a fresh Karaf instance 
repeatedly, only about 50% of the time the service actually got exported 
(as verified with a browser pointing to the appropriate WSDL URL). This 
is just on a single instance on a local host, before trying to do 
anything remotely. I started investigating into the matter, which 
required quite a lot of debugging and code reviewing, and found various 
issues with the (strangely named?) TopologyManagerExport class which 
seems to be at the heart of the matter. This took several iterations and 
includes also non-critical cleanups I did along the way - I'm attaching 
the final version I arrived at which hopefully addresses the issues, 
though due to the law of clustering of bugs, I wouldn't be surprised if 
there are other new or old bugs still lurking there... any feedback and 
additional code reviews would be greatly appreciated.

Here is a summary of all the changes, big and small:

- fixed typos in docs and logs.
- fixed formatting to be more consistent and readable.
- added debug logs to RemoteServiceAdminLifeCycleListener.added/removed 
callbacks (this proved the serious tracker bug below, but is also useful 
for general debugging).
- simplified shouldExportService returned boolean expression.
- replaced iterations on map keys followed by a get of the corresponding 
value with an iteration on Map.Entry (prevents unnecessary lookups).
- replaced map containsKey checks followed by get with a get followed by 
null check (prevents unnecessary lookups, and theoretically race 
conditions too).
- fixed tracker misuse: when tracker callback (added) is called, the 
tracker itself does not yet reflect the new state, so accessing services 
(or size) on the tracker from within the callback will fail to include 
the new service. Although the actual service export triggered by the 
callback is performed on a separate thread, this still leaves a race 
condition between the two which sometimes causes the export to fail 
(with "No RemoteServiceAdmin available!" log). To fix this, the newly 
added RSA is passed down the method chain to the doExportService method 
explicitly. doExportService then uses the reference if it is given, or 
if null (as when called from elsewhere), uses the tracker to get the 
service list as it did in the past.
- fixed exportService always overwriting a service's entry in 
exportedServices with a new empty data structure - now it only does so 
if it does not already exist.
- fixed exportedServices inner maps to be created as synchronized maps - 
previously doExportService wrapped it in a synchronized map on each 
invocation, which results effectively in no synchronization, since the 
synchronized wrapper map is local only so each thread accessing the map 
uses a different lock. Also fixed other accesses of the inner maps to be 
synchronized (Collections.synchronizedMap requires external 
synchronization on it when being iterated, as specified in the docs).
- fixed exports null check in doExportService (in the old code, 
Collections.synchronizedMap never returns null; In the new code it makes 
more sense.)
- removed redundant remoteServiceAdminTracker null check in 
doExportService (it is already used in the constructor, so can't be null 
at this point).
- added cast of null to String when calling getServiceReferences in 
exportExistingServices, so that it will compile also with the OSGi 4.3.0 
and later API (which added an overloaded method accepting a Class). Does 
not affect previous versions.
- removed removeExportRegistration, removeExportReference and 
remoteAdminEvent methods, all of which are both unused and have a no-op 

Feel free to pick out the changes u like...

In addition, I had a few related questions/thoughts, I hope someone here 
can help with them:

1. Is DOSGi being updated to work with OSGi 4.3.0/Felix 4.x/Karaf 2.3.x 
etc.? It seems to be slightly outdated. At a glance, it looks like 
making it compile against OSGi 4.3.0 requires just a few 
backward-compatible syntax changes, and hopefully the semantics have not 
changed too much.

2. The website says on the Karaf feature page: "CXF DOSGi does not work 
with Karaf 2.3.0", however it does not mention whether this applies to 
2.3.1 as well (is it a DOSGi issue or Karaf issue that prevents it from 
working?), nor does it say anything about what the problem is, where it 
might be relevant, or link to any issue in JIRA - could someone in the 
know please update the text to be a bit more informative? Also, is this 
indeed just an issue with installation via the feature, or with 
compatibility between the projects/versions in general regardless of 
installation method?

3. After all the fixes above, the export success rate is definitely 
better than 50%, but still isn't 100% - every once in a while the export 
still fails. Looking through the logs, I found the message 
"RemoteServiceAdmin Implementation is shutting down now" - during Karaf 
startup! I'm still not sure what exactly is going on, but it appears 
that the RSA is started, services exported properly etc., but then it 
suddenly commits suicide a few moments later with no other error or 
indication of who/what initiated the shutdown or why. The initial theory 
is that the dsw Activator might be getting a ConfigurationAdmin updated 
event with null configuration, which would trigger a shutdown - but I 
haven't yet confirmed that this is the case, nor why it would happen or 
what the proper behavior in such a case should be.

I'm totally new to CXF and DOSGi (and pretty new to OSGi too), so please 
forgive any silly mistakes or assumptions above... but I do hope this 
will be useful in making it a better product and making it work better 
out of the box :-)

Thanks to anyone who read this far!


View raw message