cxf-dev mailing list archives

From Daniel Kulp <>
Subject Re: JAX-WS client performances
Date Tue, 03 Feb 2015 20:02:23 GMT

> On Feb 3, 2015, at 8:11 AM, Alessio Soldano <> wrote:
> A brief update: I've committed the workaround for the HTTP connection issue (thanks Dan for
> the help on that!) as well as some other straightforward optimizations on stuff that popped
> up while profiling.
> Now, the next "big" topic I found is the way we get and set properties in the message context.
> We spend a significant amount of time creating HashMap instances and especially in
> MessageImpl#calcContextCache, which copies (putAll) all the Bus, Service, Endpoint, etc.
> properties into the context cache. You can see [1] the CPU hotspot view I currently get with
> the previously mentioned test app. AFAICS in the source history, there used to be a different
> way of dealing with message properties in the past [2], then the cache mechanism was added.
> So I'm wondering if some kind of profiling / perf testing was performed in the past and led
> to the changes. I might simply be testing an edge scenario, with very few properties being
> looked up and hence not justifying the caching.
> Any comment / idea / suggestion?

At one point, every “get” of a property would end up checking 4 or 5 hash maps, which resulted
in the keys being hashCoded many times, lots of checks, etc. When you get into the WS-Security
cases and some of the HTTP configuration cases where a bunch of keys are looked up, a LOT of
time was being spent on the lookups. For the most part, at the time, the maps were relatively
small and the cost of building a single “context” map was small in comparison, which is why
this was done.
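
To make the trade-off concrete, here is a rough Java sketch of the two strategies: walking each scope's map on every get versus merging all scopes into one map once. The class and method names (LayeredProperties, getByWalking, getFromCache) are made up for illustration and this is not the actual MessageImpl code; it also assumes no null values are stored.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not CXF's MessageImpl.
final class LayeredProperties {
    // Scopes ordered from most specific (e.g. message) to least specific (e.g. bus).
    private final List<Map<String, Object>> scopes;
    // Merged view of all scopes, built lazily on the first cached lookup.
    private Map<String, Object> cache;

    LayeredProperties(List<Map<String, Object>> scopes) {
        this.scopes = scopes;
    }

    // Strategy 1: probe every scope in turn, so a miss costs one lookup per map.
    Object getByWalking(String key) {
        for (Map<String, Object> scope : scopes) {
            Object value = scope.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    // Strategy 2: pay one putAll per scope up front, then a single lookup per get.
    Object getFromCache(String key) {
        if (cache == null) {
            Map<String, Object> merged = new HashMap<>();
            // Copy the least specific scope first so more specific scopes overwrite it.
            for (int i = scopes.size() - 1; i >= 0; i--) {
                merged.putAll(scopes.get(i));
            }
            cache = merged;
        }
        return cache.get(key);
    }
}
```

Which strategy wins depends on how many lookups follow each cache build: with only a handful of gets per message, the putAll cost can dominate, which would match the hotspot described in the quoted message.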

That said, the size of the cache map is likely fairly small as well. Maybe a dozen keys?
(And I'm willing to bet most of the keys are interned, so a == would catch them.) It might be
simpler to just use an Object[] or something.
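
As a rough sketch of that Object[] idea (purely illustrative; SmallIdentityMap and its layout are assumptions, not anything that exists in CXF), a flat array of key/value pairs can be scanned with == first and fall back to equals() only for keys that are not the same String instance:

```java
import java.util.Arrays;

// Hypothetical illustration only; not CXF code.
final class SmallIdentityMap {
    // Key stored at index 2*i, value at 2*i + 1; initial room for 12 entries.
    private Object[] slots = new Object[24];
    // Number of key/value pairs currently stored.
    private int size;

    Object get(String key) {
        // Fast pass: interned keys (e.g. string constants) match by identity.
        for (int i = 0; i < size; i++) {
            if (slots[2 * i] == key) {
                return slots[2 * i + 1];
            }
        }
        // Slow pass: fall back to equals() for keys that are equal but not identical.
        for (int i = 0; i < size; i++) {
            if (key.equals(slots[2 * i])) {
                return slots[2 * i + 1];
            }
        }
        return null;
    }

    void put(String key, Object value) {
        // Overwrite in place if the key is already present.
        for (int i = 0; i < size; i++) {
            if (slots[2 * i] == key || key.equals(slots[2 * i])) {
                slots[2 * i + 1] = value;
                return;
            }
        }
        // Grow the backing array when all pairs are in use.
        if (2 * size == slots.length) {
            slots = Arrays.copyOf(slots, slots.length * 2);
        }
        slots[2 * size] = key;
        slots[2 * size + 1] = value;
        size++;
    }
}
```

A linear scan like this only makes sense while the map stays at roughly a dozen entries; whether it actually beats a HashMap here would need to be measured with the same profiling setup.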


> Cheers
> Alessio
> [1]
> [2]
> On 27/01/15 18:14, Alessio Soldano wrote:
>> Hi,
>> my attention has recently been brought to a scenario in which an Apache CXF client
>> invokes an endpoint operation in a loop and the number of invocations performed in a given
>> amount of time (say, 2 minutes) is used as a benchmark for measuring WS stack performance.
>> It's actually a very simplistic scenario, with a plain JAX-WS single-threaded client sending
>> and receiving small RPC/Lit SOAP messages [1]. The reason why I've been asked to have a look
>> is that with default settings the Apache CXF JAX-WS impl seems to perform *shamefully* badly
>> compared to the Metro (JAX-WS RI) implementation. I've been blaming the user log configuration,
>> etc., but when I eventually tried on my own I could actually reproduce the bad results. I've
>> been profiling a bit and found a few hot spot areas where CXF could possibly be optimized, but
>> the big issue really seems to be at the HTTPConduit / HttpURLConnection level.
>> I found that almost all the invocations end up calling the available() method [2] as part
>> of the process for re-using cached connections [3]; that goes to the wire to try reading
>> and takes a lot of time.
>> When the RI does the equivalent operation, the available() method is not called [4],
>> resulting in much better performance.
>> By looking at the JDK code, it looks to me that the problem boils down to
>> [5] returning different values, as a consequence of the fixedContentLength attribute being
>> set to a value different from -1 when running on CXF only. As a matter of fact, that is set
>> when HTTPConduit.WrappedOutputStream#thresholdNotReached() is called, whenever a message is
>> completely written to the output stream buffer before the chunking threshold is reached (at
>> least AFAIU). I've searched through the JAX-WS RI and could not find any place where
>> setFixedLengthStreamingMode is called on the connection.
>> So, I've performed two quick and dirty tries: the first time I forced allowChunking
>> = false on the client policy, the second time I commented out the code in
>> HTTPConduit.WrappedOutputStream#thresholdNotReached().
>> In both cases I managed to get performance comparable to what I can get with the JAX-WS RI.
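
For readers less familiar with the JDK side, the two streaming modes discussed above correspond to two setters on java.net.HttpURLConnection. The sketch below only shows those standard JDK calls; the StreamingModes class and its method names are hypothetical, it is not CXF's HTTPConduit code, and it says nothing about how the JDK's connection cache reacts to either mode.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical illustration only; not CXF's HTTPConduit.
final class StreamingModes {
    // openConnection() only creates the connection object; nothing is sent
    // until connect() is called or a stream is requested.
    static HttpURLConnection open(String url) {
        try {
            HttpURLConnection c = (HttpURLConnection) URI.create(url).toURL().openConnection();
            c.setDoOutput(true);
            return c;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Body length known up front: the request is sent with a Content-Length header.
    static HttpURLConnection fixedLength(String url, int bodyLength) {
        HttpURLConnection c = open(url);
        c.setFixedLengthStreamingMode(bodyLength);
        return c;
    }

    // Body length not known up front: the request uses chunked transfer encoding.
    static HttpURLConnection chunked(String url, int chunkSize) {
        HttpURLConnection c = open(url);
        c.setChunkedStreamingMode(chunkSize);
        return c;
    }
}
```

The two modes are mutually exclusive on a given connection: once one has been set, calling the other setter throws IllegalStateException.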
>> Now, a few questions:
>> - are we really required to call setFixedLengthStreamingMode as we currently do?
>> what's the drawback of not calling it?
>> - should we actually do something to get decent performance by default in this
>> scenario? (not sure expecting the user to disable chunking is much of an option...)
>> As a side note, the relevant part of the JDK HttpClient code changed between JDK6
>> and JDK7, so things have not always been as explained above...
>> Cheers
>> Alessio
>> [1]
>> [2]
>> [3]
>> [4]
>> [5]
> -- 
> Alessio Soldano
> Web Service Lead, JBoss

Daniel Kulp -
Talend Community Coder -
