cxf-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Christian Schneider <ch...@die-schneider.net>
Subject Re: Fwd: DOSGi - service not exported consistently
Date Fri, 22 Mar 2013 08:31:07 GMT
On 20.03.2013 19:25, A. Rothman wrote:
>
> [The message body disappeared for some reason, resending...]
>
>
> Hi,
>
> While trying to use DOSGi (1.4.0, in Karaf 2.3.1) I came across some 
> serious issues, namely when starting up a fresh Karaf instance 
> repeatedly, only about 50% of the time the service actually got 
> exported (as verified with a browser pointing to the appropriate WSDL 
> URL). This is just on a single instance on a local host, before trying 
> to do anything remotely. I started investigating into the matter, 
> which required quite a lot of debugging and code reviewing, and found 
> various issues with the (strangely named?) TopologyManagerExport class 
> which seems to be at the heart of the matter. This took several 
> iterations and includes also non-critical cleanups I did along the way 
> - I'm attaching the final version I arrived at which hopefully 
> addresses the issues, though due to the law of clustering of bugs, I 
> wouldn't be surprised if there are other new or old bugs still lurking 
> there... any feedback and additional code reviews would be greatly 
> appreciated.
>
> Here is a summary of all the changes, big and small:
>
> - fixed typos in docs and logs.
> - fixed formatting to be more consistent and readable.
> - added debug logs to 
> RemoteServiceAdminLifeCycleListener.added/removed callbacks (this 
> proved the serious tracker bug below, but is also useful for general 
> debugging).
> - simplified shouldExportService returned boolean expression.
> - replaced iterations on map keys followed by a get of the 
> corresponding value with an iteration on Map.Entry (prevents 
> unnecessary lookups).
> - replaced map containsKey checks followed by get with a get followed 
> by null check (prevents unnecessary lookups, and theoretically race 
> conditions too).
> - fixed tracker misuse: when tracker callback (added) is called, the 
> tracker itself does not yet reflect the new state, so accessing 
> services (or size) on the tracker from within the callback will fail 
> to include the new service. Although the actual service export 
> triggered by the callback is performed on a separate thread, this 
> still leaves a race condition between the two which sometimes causes 
> the export to fail (with "No RemoteServiceAdmin available!" log). To 
> fix this, the newly added RSA is passed down the method chain to the 
> doExportService method explicitly. doExportService then uses the 
> reference if it is given, or if null (as when called from elsewhere), 
> uses the tracker to get the service list as it did in the past.
> - fixed exportService always overwriting a service's entry in 
> exportedServices with a new empty data structure - now it only does so 
> if it does not already exist.
> - fixed exportedServices inner maps to be created as synchronized maps 
> - previously doExportService wrapped it in a synchronized map on each 
> invocation, which results effectively in no synchronization, since the 
> synchronized wrapper map is local only so each thread accessing the 
> map uses a different lock. Also fixed other accesses of the inner maps 
> to be synchronized (Collections.synchronizedMap requires external 
> synchronization on it when being iterated, as specified in the docs).
> - fixed exports null check in doExportService (in the old code, 
> Collections.synchronizedMap never returns null; In the new code it 
> makes more sense.)
> - removed redundant remoteServiceAdminTracker null check in 
> doExportService (it is already used in the constructor, so can't be 
> null at this point).
> - added cast of null to String when calling getServiceReferences in 
> exportExistingServices, so that it will compile also with the OSGi 
> 4.3.0 and later API (which added an overloaded method accepting a 
> Class). Does not affect previous versions.
> - removed removeExportRegistration, removeExportReference and 
> remoteAdminEvent methods, all of which are both unused and have a 
> no-op implementation.
>
> Feel free to pick out the changes u like...
>
> In addition, I had a few related questions/thoughts, I hope someone 
> here can help with them:
>
> 1. Is DOSGi being updated to work with OSGi 4.3.0/Felix 4.x/Karaf 
> 2.3.x etc.? It seems to be slightly outdated. At a glance, it looks 
> like making it compile against OSGi 4.3.0 requires just a few 
> backward-compatible syntax changes, and hopefully the semantics have 
> not changed too much.
>
DOSGi should work with all newer versions of Felix and Karaf. The OSGI 
4.3.0 spec should be compatible to 4.2.0. So this should also work.
> 2. The website says on the Karaf feature page: "CXF DOSGi does not 
> work with Karaf 2.3.0", however it does not mention whether this 
> applies to 2.3.1 as well (is it a DOSGi issue or Karaf issue that 
> prevents it from working?), nor does it say anything about what the 
> problem is, where it might be relevant, or link to any issue in JIRA - 
> could someone in the know please update the text to be a bit more 
> informative? Also, is this indeed just an issue with installation via 
> the feature, or with compatibility between the projects/versions in 
> general regardless of installation method?
>
The website says "The default aegis data format will not work with 
Apache Karaf 2.3.0. (See DOSGI-153 
<https://issues.apache.org/jira/browse/DOSGI-153>). Please upgrade to 
Apache Karaf 2.3.1".

> 3. After all the fixes above, the export success rate is definitely 
> better than 50%, but still isn't 100% - every once in a while the 
> export still fails. Looking through the logs, I found the message 
> "RemoteServiceAdmin Implementation is shutting down now" - during 
> Karaf startup! I'm still not sure what exactly is going on, but it 
> appears that the RSA is started, services exported properly etc., but 
> then it suddenly commits suicide a few moments later with no other 
> error or indication of who/what initiated the shutdown or why. The 
> initial theory is that the dsw Activator might be getting a 
> ConfigurationAdmin updated event with null configuration, which would 
> trigger a shutdown - but I haven't yet confirmed that this is the 
> case, nor why it would happen or what the proper behavior in such a 
> case should be.
I have not yet seen this problem. If you can find a way to reproduce it 
I would be very interested in your findings.
>
> I'm totally new to CXF and DOSGi (and pretty new to OSGi too), so 
> please forgive any silly mistakes or assumptions above... but I do 
> hope this will be useful in making it a better product and making it 
> work better out of the box :-)

Many thanks for your efforts already.  It will take a while to work 
through your changes and test them. I will take a look as soon as they 
are in jira so we can properly track them. For the future it would be 
great if you could separate changes into smaller chunks and create 
individual jira issues for each. This way it is easier to work through 
the changes ... and it also gives you more visibility if you want to 
become a committer ;-)

Christian

-- 
Christian Schneider
http://www.liquid-reality.de

Open Source Architect
http://www.talend.com


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message