cxf-dev mailing list archives

From "Benson Margulies" <>
Subject JProfiler Results on CXF startup
Date Thu, 13 Mar 2008 07:05:59 GMT
I thought it worthwhile to start a new thread with a clear subject line on
my efforts to characterize startup performance.

I obtained a group license to JProfiler for us, and I proceeded to apply it
to several small programs that create client or server endpoints.

I could upload JProfiler snapshots or HTML call trees (those are really big)
to someplace, like my home dir, or even Confluence, if someone wants to
check up on me, and I could add the little programs that I have sitting in
the systest subproject to svn.

One caveat: I measured in our development environment, not against a
release snapshot. In particular, the classloader is operating against a
medium-long list of directories of classes instead of a same-sized, or even
much shorter, list of JAR files. This amplifies some classloading costs.
The most conspicuous result is that Spring+Xerces sure eats up a lot of
time. For either a client or a server endpoint, the cost of creating the
default Bus via Spring is much larger than anything else we do. More time
goes into Xerces than into Spring itself. This leads me to contemplate a
StAX bean loader class, which it should be possible to bolt onto Spring
through their published API.
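To make the StAX idea concrete, here is a minimal sketch of pulling bean
definitions out of a Spring-style XML file with a streaming StAX reader
instead of a full Xerces DOM build. The class name and the idea of scanning
only for bean ids are my own illustration, not Spring's actual bean-loading
API; a real loader would have to build full BeanDefinitions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: stream over a Spring-style config with StAX,
// collecting bean ids without materializing a DOM tree.
public class StaxBeanScan {
    static List<String> beanIds(String xml) throws Exception {
        List<String> ids = new ArrayList<>();
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT
                    && "bean".equals(r.getLocalName())) {
                String id = r.getAttributeValue(null, "id");
                if (id != null) {
                    ids.add(id);
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<beans>"
                + "<bean id='cxf' class='org.apache.cxf.bus.CXFBusImpl'/>"
                + "<bean id='binding' class='example.SomeBean'/>"
                + "</beans>";
        System.out.println(beanIds(xml)); // [cxf, binding]
    }
}
```

The point of the sketch is that the parser never allocates element nodes:
for a config that is mostly read once at startup, the streaming pass should
avoid most of the Xerces time we are seeing.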

Spring+Xerces is the only answer to the question, 'why does it take so long
to initialize *a single* endpoint application?'

Then I moved on to tests that loop creating an endpoint without incurring
any additional Spring+Xerces overhead.

The next thing that turned up, not entirely surprisingly, was the JAXB RI.
Not much we can do about that, though I discovered that an optimization that
I added last night had the effect of lowering the cost of JAXB startup. I
confess that I did not explain why except to observe that JAXB did less
classloading when I used a cache to speed up JAX-WS's location of wrapper
classes. Since all the tests pass, I'm trusting that I didn't change any
behavior.

Looking for time not (obviously) connected to JAXB, my next sad report
concerns XmlSchema. Consider a simple JAX-WS+JAXB client endpoint with a
WSDL. It reads and parses the WSDL, and builds the service model. Building
the service model did not appear as a significant time sink. Parsing the
WSDL file did ... because XmlSchema's namespace resolution mechanism turns
out to be a significant hotspot. I haven't looked at their code in detail,
since it's cumbersome to set up for the purpose from Sweden. And I think
that we all know that we have some frustration in seeing changes move
through the XmlSchema process to a release.

Caching the parsed form of a WSDL seems a plausible thing to do, except that
I'm personally quite aware that the process of building the service model
includes filling gaps that turn up in the schema. Also, in theory, the WSDL
could *change* from one Endpoint creation to the next, could it not? We
could make a conscious decision to ignore that possibility. Of course,
XmlSchema itself has no cloning concept. We could move the 'gap-filling'
code from the service factory where it lives now to the WSDL schema-getter
so that it happens once and for all, and then treat the schema as read-only.
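A cache along those lines could be as simple as a map keyed by WSDL
location, handing back the same parsed form on every Endpoint creation.
Everything below is a placeholder sketch: Definition stands in for the
parsed WSDL (e.g. wsdl4j's javax.wsdl.Definition), and none of this is
current CXF code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of caching the parsed WSDL by URL and treating the
// cached model as read-only, per the tradeoff discussed above.
public class WsdlCache {
    // Placeholder for the parsed WSDL + schema model.
    static final class Definition {
        final String url;
        Definition(String url) { this.url = url; }
    }

    private final Map<String, Definition> cache = new ConcurrentHashMap<>();

    Definition get(String wsdlUrl) {
        // Conscious decision from the discussion above: we ignore the
        // possibility that the WSDL changes between Endpoint creations.
        return cache.computeIfAbsent(wsdlUrl, Definition::new);
    }

    public static void main(String[] args) {
        WsdlCache c = new WsdlCache();
        Definition a = c.get("http://example.com/service?wsdl");
        Definition b = c.get("http://example.com/service?wsdl");
        System.out.println(a == b); // second lookup hits the cache
    }
}
```

For this to be safe, the gap-filling would indeed have to run once, before
the model goes into the map, so that readers never mutate a shared schema.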

Finally, let me mention the one thing that I did something about. In the
process of assembling an endpoint, the code makes repeated calls to
getResponseWrapper and getRequestWrapper for JAXWS. Each of these calls in
turn does some expensive reflection and class-loading. I introduced a little
cache in the JaxWsServiceConfiguration, and got the time back. This, of
course, means that the simple case is now a trifle slower, to the tune of
creating two HashMaps and making a few probes. However, since even a single
endpoint creation ends up calling these more than once per operation, it's
at worst, as far as I can tell, a wash. In a loop creating 100 Endpoints,
it's a worthwhile improvement.
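The shape of that cache is roughly the following. This is an illustrative
sketch, not the actual JaxWsServiceConfiguration code: the class name
WrapperCache and the method wrapperFor are mine, and the real code keys off
the operation rather than a bare class name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the wrapper-class cache: memoize the expensive
// reflection/classloading lookup so repeated getRequestWrapper /
// getResponseWrapper calls pay the cost only once.
public class WrapperCache {
    private final Map<String, Class<?>> cache = new ConcurrentHashMap<>();

    Class<?> wrapperFor(String wrapperClassName) {
        return cache.computeIfAbsent(wrapperClassName, name -> {
            try {
                return Class.forName(name); // pay the classloading cost once
            } catch (ClassNotFoundException e) {
                // Returning null leaves no mapping, so a miss is retried;
                // real code would want to remember misses as well.
                return null;
            }
        });
    }

    public static void main(String[] args) {
        WrapperCache wc = new WrapperCache();
        Class<?> first = wc.wrapperFor("java.lang.String");
        Class<?> second = wc.wrapperFor("java.lang.String");
        System.out.println(first == second && first == String.class);
    }
}
```

Two small maps plus a few probes is exactly the overhead described above;
everything past the first probe per key is a hit.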

I think it would be a good thing if some others would join the discussion
here about what, if anything, to do next. At one level, we could focus on
making performance tests part of the standard (or an optional) build, to
ensure that we don't code big performance regressions. Or we could aim at
some of the issues discussed above. Or I could make more measurements.
