uima-user mailing list archives

From "Eddie Epstein" <eaepst...@gmail.com>
Subject Re: CPE to AS Transition ... Porting processingUnitThreadCount
Date Thu, 25 Sep 2008 13:27:18 GMT
In order to optimize deployment, it is good to focus on where the work is
being done and then what overhead is added by the framework.

In your case all the work for components CR, A, C and D is expected to be
done on Machine 0. Separating the CR from the aggregate adds unnecessary CAS
serialization overhead for every document, so it would be better to move the
CR into the aggregate. Components A, C or D can be replicated as needed
(using numInstances as appropriate for each) in the one aggregate instance.

Machines 1..N are then used to scale out multiple instances of B.
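
As a rough illustration of that layout, a deployment descriptor for the single aggregate on Machine 0 might look something like the sketch below. It is not taken from the poster's system: the queue names, broker URL, descriptor path, and instance counts are all placeholders, and it assumes that "numInstances" above corresponds to the numberOfInstances attribute of the scaleout element. The delegate holding the CR logic, and C and D, would be listed alongside A in the same way.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: every name, path, URL and count below is a placeholder. -->
<analysisEngineDeploymentDescription
    xmlns="http://uima.apache.org/resourceSpecifier">
  <name>Aggregate on Machine 0 (illustrative)</name>
  <deployment protocol="jms" provider="activemq">
    <service>
      <!-- Queue the client application sends its kick-off CAS to -->
      <inputQueue endpoint="AggregateQueue"
                  brokerURL="tcp://machine0:61616"/>
      <topDescriptor>
        <import location="desc/MyAggregate.xml"/>
      </topDescriptor>
      <!-- async="true" gives each delegate its own queue and threads -->
      <analysisEngine async="true">
        <delegates>
          <!-- Colocated delegate, replicated inside the one aggregate -->
          <analysisEngine key="A">
            <scaleout numberOfInstances="2"/>
          </analysisEngine>
          <!-- B runs as separate services on Machines 1..N, all reading
               from the same queue; starting more B services adds capacity.
               Error handling settings for B would also go in this element. -->
          <remoteAnalysisEngine key="B">
            <inputQueue endpoint="StepBQueue"
                        brokerURL="tcp://machine0:61616"/>
            <serializer method="xmi"/>
          </remoteAnalysisEngine>
        </delegates>
      </analysisEngine>
    </service>
  </deployment>
</analysisEngineDeploymentDescription>
```

With this shape, only the CASes sent to B cross a machine boundary; everything else stays inside the one JVM on Machine 0.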

RunRemoteAsyncAE could just send an "empty" CAS to kick off the CR in the
aggregate, or the CAS could contain information about the collection to be
processed by the CR.

Note that RunRemoteAsyncAE is a fairly simple application, and it is the
UIMA AS async API that optionally deploys colocated services and/or
optionally instantiates a CR. My point is that RunRemoteAsyncAE could be
replaced with a custom application that (via unspecified mechanisms) deploys
B on remote machines, then deploys the aggregate in the same JVM, runs it,
and shuts everything down at the end.


On Thu, Sep 25, 2008 at 6:54 AM, Charles Proefrock <chas.pro@hotmail.com> wrote:

> I've reviewed Fig. 4 and Fig. 3.  Our system seems closer to Fig. 3
> (a single Collection Reader (CR) with CasPool size X used to push
> documents to X services).
>
> Assuming the "Service Instance" is an aggregate (AG) with multiple AE
> steps A..D, we are extending Fig. 3 with another level of remote AE for
> one of the steps:
>
>   Machine0:  Broker + RunRemoteAsyncAE + 2 AG Service Instances
>   Machine1:  RemoteStepB_AE Instance
>   Machine2:  RemoteStepB_AE Instance
>
> The AG descriptor is configured with A..D in-line, and the AG deployment
> descriptor has a remote 'B' override, possibly with error handling
> controls, etc.
>
>   CR  --- || --2-- A
>                    B --- || --2-- remote 'B'
>                    C
>                    D (consumer)
>
> If I'm following your guidance, we should not use numInstances in the AG
> deployment descriptor because we have decided to remote 'B'. Instead we
> need to deploy the 2 AG Service Instances via our own launch mechanism
> (as either multiple -d flags on RunRemoteAsyncAE, or independently in
> their own processes).
>
> Let me know if I'm on track.
>
> - Charles
