uima-user mailing list archives

From holmberg2066@comcast.net (g...@holmberg.name)
Subject Why XmiCasSerializer is slow
Date Wed, 11 Jul 2007 01:44:05 GMT
I previously reported that when I used XmiCasSerializer with many (10) concurrent AnalysisEngines,
my throughput dropped to about half and did not scale up.

I did some profiling of my code using JProbe, and I think I've found the problem.

I discovered that my application spent 64% of its elapsed time in XmiCasSerializer and its
child methods.  Within that, one method rose to the top: 72% of elapsed time was spent in
TypeSystemImpl.ll_isValidTypeCode().  In fact, this exceeded the time spent in XmiCasSerializer.

This in turn was almost all in SymbolTable.getSymbol().  This was called over 17 million times
in my application, which spent 72% of its elapsed time in this one method.  99.9% of its time
was spent in itself, and not its children (Vector.get(int) was the highest child, at 0.1%).

I'm not exactly sure why this method takes so long.  I suspect it's a concurrency issue. 
I see a synchronized block in the set() method, so that would be something to look into. 
Given that some of my AnalysisEngines may be inserting annotations while others are executing
XmiCasSerializer, I can see potential for conflict.
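
A minimal Java sketch of the contention pattern suspected above, for anyone who wants to
experiment.  It is not the UIMA source: ToySymbolTable, runLookups and ContentionSketch are
hypothetical names made up for illustration, and the table sizes and thread counts are
arbitrary assumptions.  The only library fact it relies on is that java.util.Vector
synchronizes its methods, so concurrent readers share one monitor with each other and with
any writer.  It compares lookup time with 1 thread and with 10 threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ContentionSketch {

    // Hypothetical stand-in for a symbol table; NOT the UIMA SymbolTable code.
    static final class ToySymbolTable {
        private final Vector<String> symbols = new Vector<>();

        // Writers hold the table's monitor while appending a symbol.
        synchronized int set(String symbol) {
            symbols.add(symbol);
            return symbols.size() - 1;
        }

        // Vector.size() and Vector.get(int) are synchronized, so every
        // lookup acquires the Vector's monitor, even for read-only access.
        String getSymbol(int code) {
            if (code < 0 || code >= symbols.size()) {
                return null;
            }
            return symbols.get(code);
        }

        int size() {
            return symbols.size();
        }
    }

    // Runs the same lookup loop on `threads` threads and returns the total
    // number of successful lookups across all of them.
    static long runLookups(ToySymbolTable table, int threads, int lookupsPerThread)
            throws Exception {
        final int size = table.size();
        if (size == 0) {
            throw new IllegalArgumentException("table must not be empty");
        }
        Callable<Long> task = () -> {
            long hits = 0;
            for (int i = 0; i < lookupsPerThread; i++) {
                if (table.getSymbol(i % size) != null) {
                    hits++;
                }
            }
            return hits;
        };
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        ToySymbolTable table = new ToySymbolTable();
        for (int i = 0; i < 100; i++) {
            table.set("type" + i);
        }
        // Compare a single reader against 10 concurrent readers.
        for (int threads : new int[] {1, 10}) {
            long start = System.nanoTime();
            long hits = runLookups(table, threads, 1_000_000);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " thread(s): " + hits + " lookups in " + ms + " ms");
        }
    }
}
```

If the 10-thread run takes much more than the 1-thread run per lookup, that would be
consistent with lock contention, but timings depend heavily on the JVM and hardware, and
this toy does not by itself show that the real getSymbol() is slow for the same reason.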

Hopefully, these clues will be enough for someone familiar with the code to figure it out.

Greg Holmberg
