I previously described how, when I used XmiCasSerializer with many (10) concurrent AnalysisEngines,
my throughput dropped to about half and didn't scale up.
I did some profiling of my code using JProbe, and I think I've found the problem.
I discovered that my application spent 64% of its elapsed time in XmiCasSerializer and its
child methods. Within that, one method rose to the top: 72% of elapsed time was spent in
TypeSystemImpl.ll_isValidTypeCode(). In fact, this exceeded the time spent in XmiCasSerializer
(about 114% of it).
This in turn was almost all in SymbolTable.getSymbol(). This method was called over 17 million
times in my application, which spent 72% of its elapsed time in it. 99.9% of that time
was spent in the method itself, not in its children (Vector.get(int) was the highest child, at 0.1%).
I'm not exactly sure why this method takes so long. I suspect it's a concurrency issue.
I see a synchronized block in the set() method, so that would be something to look into.
Given that some of my AnalysisEngines may be inserting annotations while others are executing
XmiCasSerializer, I can see potential for conflict.
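
To check the contention theory in isolation, a small harness along these lines might help. It is only a sketch under assumptions: VectorSymbols is a hypothetical stand-in I made up for a Vector-backed symbol table, not the actual UIMA SymbolTable, and runLookups is likewise an illustrative name. The one thing it relies on for certain is that java.util.Vector's size() and get(int) are synchronized methods, so every lookup from every thread takes the same lock.

```java
import java.util.Vector;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

public class SymbolLookupContention {

    /** Hypothetical stand-in for a Vector-backed symbol table; NOT the UIMA class. */
    static final class VectorSymbols {
        private final Vector<String> symbols = new Vector<>();

        VectorSymbols(int size) {
            for (int i = 0; i < size; i++) {
                symbols.add("type" + i);
            }
        }

        /** Returns the symbol for a code, or null if the code is out of range. */
        String getSymbol(int code) {
            // Vector.size() and Vector.get(int) each lock the Vector.
            return (code >= 0 && code < symbols.size()) ? symbols.get(code) : null;
        }
    }

    /** Runs lookups on `threads` threads; returns {elapsedNanos, successfulLookups}. */
    static long[] runLookups(VectorSymbols table, int tableSize, int threads, int lookupsPerThread) {
        CountDownLatch startGate = new CountDownLatch(1);
        AtomicLong hits = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    startGate.await(); // release all workers at the same moment
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                long local = 0;
                for (int i = 0; i < lookupsPerThread; i++) {
                    if (table.getSymbol(i % tableSize) != null) {
                        local++;
                    }
                }
                hits.addAndGet(local);
            });
            workers[t].start();
        }
        long begin = System.nanoTime();
        startGate.countDown();
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return new long[] { System.nanoTime() - begin, hits.get() };
    }

    public static void main(String[] args) {
        VectorSymbols table = new VectorSymbols(1000);
        for (int threads : new int[] { 1, 10 }) {
            long[] r = runLookups(table, 1000, threads, 1_000_000);
            System.out.printf("%d thread(s): %d lookups in %.1f ms%n", threads, r[1], r[0] / 1e6);
        }
    }
}
```

If the 10-thread run takes much more than the 1-thread run per lookup, that would be consistent with lock contention on the shared table. The timings depend heavily on the JVM and hardware, so this can only suggest a direction, not confirm the cause in the real code.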
Hopefully, these clues will be enough for someone familiar with the code to figure it out.
Greg Holmberg