uima-dev mailing list archives

From Marshall Schor <...@schor.com>
Subject Another interesting potential speedup
Date Wed, 01 Oct 2008 19:21:12 GMT
Profiling certainly shows unusual places you'd never think to look :-)

This may be a bit of an anomaly - but we have a scaleout test for
uima-as, sending large numbers of CASes over the wire (but the test is
running in multiple JVMs on one machine - so there are no network
delays).  We're running this with essentially empty CASes - just to see
where other overhead is.

We expected that things like deserialization would not show up - because
the CASes were empty.  However, deserialization was the biggest time
consumer.  Looking into this, it turns out that (in our particular case)
90% of the time in deserialization was due to creating a new XML Reader
(the call: XMLReaderFactory.createXMLReader).  A quick search on the
internet turned up this link:
http://www.ibm.com/developerworks/xml/library/x-perfap2.html which
suggested this could indeed be a bottleneck, which could be avoided by
reusing the same XMLReader object, instead of throwing it away and
getting a new one on every call.
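
To make the reuse idea concrete, here is a minimal single-threaded sketch
in Java. It is not the UIMA deserialization code. The names
ReusedReaderSketch, ElementCounter and parseAll are made up for
illustration, and the reader is obtained through the JAXP
SAXParserFactory rather than XMLReaderFactory.createXMLReader, which is
an assumption made only to keep the example free of deprecation warnings
on newer JDKs. The point is that one XMLReader is created once and then
used for every document, which SAX permits as long as the previous parse
has finished.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration: parse many documents with ONE XMLReader.
class ReusedReaderSketch {

    // Counts start tags; the counter is reset at the start of each document.
    static class ElementCounter extends DefaultHandler {
        int count;

        @Override
        public void startDocument() {
            count = 0;
        }

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            count++;
        }
    }

    // Returns the number of elements found in each input document.
    static int[] parseAll(String[] docs) {
        try {
            // The expensive step: done once, outside the loop.
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            ElementCounter counter = new ElementCounter();
            reader.setContentHandler(counter);

            int[] counts = new int[docs.length];
            for (int i = 0; i < docs.length; i++) {
                // The same reader is reused with a fresh input source.
                reader.parse(new InputSource(new StringReader(docs[i])));
                counts[i] = counter.count;
            }
            return counts;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }
}
```

Calling ReusedReaderSketch.parseAll with the two documents "<a/>" and
"<a><b/><c/></a>" yields the counts 1 and 3, using a single reader for
both parses.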

This would take some work (pooling, etc.) to make things thread-safe,
but might be a good thing to do -- unless small but non-empty CASes turn
out to bottleneck in some other way that swamps this measurement.
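
One possible shape for the pooling mentioned above, as a hedged sketch
rather than a proposal for the actual uima-as code: idle readers sit in
a concurrent queue, a thread borrows one for the duration of a single
parse, and the reader goes back to the queue afterwards. The class name
XmlReaderPoolSketch and its methods are hypothetical, and the choices
here (an unbounded pool, dropping a reader after a failed parse, readers
created via SAXParserFactory) are assumptions, not measured decisions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical illustration: a tiny pool so that no XMLReader is ever
// used by two threads at once, while readers are still reused.
class XmlReaderPoolSketch {
    private final ConcurrentLinkedQueue<XMLReader> idle =
        new ConcurrentLinkedQueue<>();
    private final AtomicInteger created = new AtomicInteger();

    // Take an idle reader, or create one if none is available.
    private XMLReader borrow() {
        XMLReader reader = idle.poll();
        if (reader != null) {
            return reader;
        }
        try {
            reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            created.incrementAndGet();
            return reader;
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("cannot create XMLReader", e);
        }
    }

    // Parse one document; the borrowed reader is private to this call.
    void parse(String xml, ContentHandler handler) {
        XMLReader reader = borrow();
        try {
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            // Returned only after a clean parse; a reader that failed is
            // simply dropped (a conservative assumption in this sketch).
            idle.offer(reader);
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    // Number of readers created so far, to observe how much reuse happens.
    int createdCount() {
        return created.get();
    }
}
```

With a single calling thread, any number of successful parse calls
creates only one reader. Under load the pool grows to at most the number
of threads parsing at the same moment. A ThreadLocal holding one reader
per thread would be a simpler alternative if the set of threads is small
and long-lived.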

This only applies to transports that use an XML style of
serialization/deserialization, of course.

