uima-user mailing list archives

From Roberto Franchini <ro.franch...@gmail.com>
Subject What's the right Cas Pool size?
Date Mon, 26 Jan 2009 23:46:37 GMT
Hi,
I'm running collection processing on a quad-core machine with 8 GB of
RAM. I run the CPE with 4 threads and have tried various sizes for the
CAS pool. I patched the CPE so that I can now set a CAS pool size
larger than 60, but the result is always the same:
The CAS processor pool is empty. (Thread Name: [Procesing Pipeline#4
Thread]::) Total size: 1 Free in pool: 0

This happens with 60, 100, or even 240!
Currently this pipeline can analyze 60k documents per hour. That's
good, but I hope to reach 100k docs/h. The old one (not UIMA-based)
did 24000 docs/h single-threaded, and I was able to run 3 pipelines in
parallel (three processors).
This new pipeline does more work, and I'm able to run 2 pipelines in
separate processes to achieve 90k docs/h. To run 2 pipelines I
have to limit the threads to 2.
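The figures above can be put side by side with a little arithmetic. The sketch below is only illustrative: the class and method names are hypothetical, it assumes the old pipeline scaled linearly across its 3 processes, and it assumes "threads limited to 2" means 2 threads in each of the 2 processes.

```java
import java.util.Locale;

public class ThroughputSketch {
    // Docs per hour handled by each worker, assuming the load is split evenly.
    static double perWorker(double docsPerHour, int workers) {
        return docsPerHour / workers;
    }

    // Factor by which throughput must grow to reach the target.
    static double speedupNeeded(double current, double target) {
        return target / current;
    }

    public static void main(String[] args) {
        // Old pipeline: 24000 docs/h per process, 3 processes (linear scaling assumed).
        double oldTotal = 24_000 * 3;                      // 72000 docs/h
        // New pipeline, one CPE with 4 threads at 60k docs/h.
        double singleProcess = perWorker(60_000, 4);       // 15000 docs/h per thread
        // New pipeline, two processes with 2 threads each (assumed) at 90k docs/h.
        double twoProcesses = perWorker(90_000, 4);        // 22500 docs/h per thread

        System.out.println(String.format(Locale.ROOT,
                "old total: %.0f docs/h", oldTotal));
        System.out.println(String.format(Locale.ROOT,
                "per thread, 1 process x 4 threads: %.0f docs/h", singleProcess));
        System.out.println(String.format(Locale.ROOT,
                "per thread, 2 processes x 2 threads: %.0f docs/h", twoProcesses));
        System.out.println(String.format(Locale.ROOT,
                "speedup needed from 60k to 100k: %.2fx",
                speedupNeeded(60_000, 100_000)));
    }
}
```

With the same total of 4 threads, the per-thread rate is higher when the work is split across two processes, which may point to contention inside a single JVM rather than to the pool size itself.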
The major limiting factor is the creation of a lot of temporary
objects, so this is the JVM configuration I use to mitigate it:
-Xmx3072M -XX:NewSize=1024M \
-XX:ParallelGCThreads=4 -XX:+UseParallelOldGC \
-XX:-UseGCOverheadLimit -XX:+DisableExplicitGC \
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
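
One way to confirm that heap and collector flags like these actually took effect is to ask the running JVM. This is a minimal sketch using only standard `java.lang.management` calls; the class name `JvmSettingsCheck` is hypothetical and is not part of UIMA.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class JvmSettingsCheck {
    // Maximum heap the JVM will attempt to use, in MiB (reflects -Xmx).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // Names of the garbage collectors active in this JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB): " + maxHeapMiB());
        System.out.println("collectors: " + collectorNames());
        // Arguments the JVM was started with, to compare against the list above.
        System.out.println("jvm args: "
                + ManagementFactory.getRuntimeMXBean().getInputArguments());
    }
}
```

Running this with the same options as the CPE process shows whether the heap size and collector match what was requested.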

I wonder if there's a right CAS pool size that would increase the
analysis speed.
If necessary I can modify the queue sizes (work queue, consumer queue).
Any suggestions?
Roberto
-- 
Roberto Franchini
http://www.celi.it
http://www.blogmeter.it
http://www.memesphere.it
Tel +39-011-6600814
jabber:ro.franchini@gmail.com skype:ro.franchini
