uima-user mailing list archives

From Eddie Epstein <eaepst...@gmail.com>
Subject Re: "Run as AS aggregate" and pre-fetching
Date Mon, 21 Oct 2013 12:57:43 GMT
In a core UIMA aggregate engine all annotators run in a single thread, and
the code path from one annotator to the next is short. When the aggregate
is deployed asynchronously, with each annotator in a different thread, that
code path is much longer and there is thread-switching overhead as well.

In my experience there are two generally successful approaches to deploying
UIMA-AS with multiple threads.
The simplest is to keep the entire pipeline synchronous and deploy N
pipeline instances, each running in its own thread; this design is good for
high throughput.
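
As a rough illustration of this first approach, here is a minimal sketch in
plain Java. It does not use the UIMA or UIMA-AS APIs; the Pipeline class,
its two stages, and the processAll method are hypothetical stand-ins for a
synchronous aggregate and its annotators. Each worker thread lazily creates
its own pipeline instance, so no instance is ever shared between threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SyncPipelinePool {
    // Hypothetical stand-in for one synchronous aggregate: every stage
    // runs in order on the calling thread, with no hand-off between threads.
    static class Pipeline {
        String process(String doc) {
            String trimmed = doc.trim();      // stage 1 (illustrative)
            return trimmed.toLowerCase();     // stage 2 (illustrative)
        }
    }

    // Runs all documents through `instances` pipeline copies, one per thread.
    static List<String> processAll(List<String> docs, int instances) {
        // One Pipeline per worker thread, created on first use.
        ThreadLocal<Pipeline> perThread = ThreadLocal.withInitial(Pipeline::new);
        ExecutorService pool = Executors.newFixedThreadPool(instances);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String doc : docs) {
                futures.add(pool.submit(() -> perThread.get().process(doc)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());         // results keep submission order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(processAll(List.of(" Hello ", " UIMA "), 2));
    }
}
```

Throughput scales with the number of instances because documents are
processed independently, while each individual document still pays the full
sequential cost of the pipeline.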

The second approach deploys only the top-level aggregate (and carefully
selected 2nd- or 3rd-level aggregates) asynchronously, with the idea that
operations can proceed in parallel and slower components can be replicated;
this design is good for low latency. Note that asynchronous components can
only operate in parallel if they are working on different CASes, so CAS
Multipliers, each with its own pool of CASes, are needed.
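
To illustrate why several CASes must be in flight, here is a minimal sketch
in plain Java, again without the UIMA-AS APIs. The two stages, the queues,
and the Semaphore standing in for a CAS pool are all assumptions made for
illustration. Each stage runs in its own thread; the stages can only overlap
when the pool allows more than one item in flight at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Semaphore;

public class AsyncStagesSketch {
    // Unique end-of-input marker, compared by identity on purpose.
    private static final String END = new String("END");

    static List<String> run(List<String> docs, int casPoolSize) {
        // Hypothetical stand-in for a CAS pool: one permit per free CAS.
        Semaphore casPool = new Semaphore(casPoolSize);
        BlockingQueue<String> toStage1 = new ArrayBlockingQueue<>(casPoolSize + 1);
        BlockingQueue<String> toStage2 = new ArrayBlockingQueue<>(casPoolSize + 1);
        List<String> results = new ArrayList<>();

        // Stage 1 runs in its own thread and hands each item to stage 2.
        Thread stage1 = new Thread(() -> {
            try {
                for (String s = toStage1.take(); s != END; s = toStage1.take()) {
                    toStage2.put(s.trim());
                }
                toStage2.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Stage 2 finishes the item and returns its "CAS" to the pool.
        Thread stage2 = new Thread(() -> {
            try {
                for (String s = toStage2.take(); s != END; s = toStage2.take()) {
                    results.add(s.toUpperCase());
                    casPool.release();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        stage1.start();
        stage2.start();
        try {
            for (String doc : docs) {
                casPool.acquire();   // blocks while every CAS is in use
                toStage1.put(doc);
            }
            toStage1.put(END);
            stage1.join();
            stage2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(" a ", " b ", " c "), 2));
    }
}
```

With casPoolSize set to 1, the feeding loop cannot admit a second item until
stage 2 releases the first, so the two stages never work at the same time
and the thread hand-off is pure overhead.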

It is best to keep aggregates synchronous unless there is a useful reason
not to.


On Sat, Oct 19, 2013 at 1:25 PM, John David Osborne (Campus)
<ozborn@uab.edu> wrote:

> What are the consequences of selecting in a UIMA-AS deployment descriptor
> "Run as AS aggregate"?
> I found an email from a year ago online where Eddie Epstein wrote:
> "UIMA-AS will put every asynchronous component in a separate thread. Using
> the ComponentDescriptorEditor on a UIMA-AS deployment
> descriptor, marking an aggregate with "Run as AS aggregate" will make
> every delegate in *that* aggregate an asynchronous component."
> I have a deployment with 32 aggregate analysis engines but I have not
> checked the box "Run as AS aggregate" in the deployment descriptor. Should
> I generally be doing this for all aggregate analysis engines? I'm not sure
> I understand the tradeoff very well. It sounds like I could get some
> performance improvements by checking this box, since everything could run
> asynchronously; however, it also sounds like some things may break if my
> pipeline isn't really ready to be run asynchronously.
> Did I get that right?
> Also, I noticed that the Eclipse component editor for UIMA-AS deployment
> descriptors doesn't provide the option to set pre-fetching (you can't see
> it either).
>  -John
> --
> John David Osborne
> Research Associate
> University of Alabama at Birmingham
> Biomedical Informatics
> Center for Clinical and Translational Science
> 1720 7th Avenue South
> Sparks Building, Suite 175
> Birmingham, AL, 35294