uima-user mailing list archives

From Eddie Epstein <eaepst...@gmail.com>
Subject Re: DUCC and CAS Consumers
Date Wed, 11 Apr 2018 13:47:30 GMT
Hi Erik,

DUCC jobs can scale out a user's components in two ways: horizontally, by
running multiple processes (process_deployments_max), and vertically, by
running the pipeline defined by the CM, AE and CC components in multiple
threads (process_pipeline_count). Since the top-level AAE that DUCC
constructs is designed to run in multiple threads, every pipeline
component, including the CC, must have multiple deployments enabled.
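
As a rough illustration, the two scale-out settings might appear in a job
specification file as shown below. The numbers are arbitrary placeholders
chosen for the example, not recommendations.

```properties
# Hypothetical values, for illustration only.
# Horizontal scale-out: upper bound on the number of job processes.
process_deployments_max = 8
# Vertical scale-out: number of pipeline instances (threads) per process.
process_pipeline_count  = 4
```

With these placeholder values, up to 8 x 4 = 32 pipeline instances could run
concurrently, which is why every component in the pipeline has to tolerate
multiple instances.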

The CM and CC components are optional, as they could already be included in
the specified process_descriptor_AE. The reason for specifying the CM and CC
components explicitly is to facilitate high scale-out. The job's collection
reader should create CASes with references to data, which will often be
segmented by the CM into a collection of CASes to be processed by the user's
AE. The initial CAS created by the driver normally does not flow into the
AE, but typically does flow to the CC after all child CASes from the CM
have been processed, to trigger the CC to finalize the collection.
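
A job specification that names each component separately could be sketched
roughly as follows. The descriptor names on the right-hand side are
hypothetical, and the driver_descriptor_CR and process_descriptor_CM keys are
not named in this message; they are assumed here from the DUCC job
specification and should be checked against the duccbook.

```properties
# All descriptor names below are hypothetical placeholders.
# Collection reader, run in the job driver: creates the initial CASes
# that hold references to the data.
driver_descriptor_CR  = com.example.MyWorkItemReader
# CAS multiplier: segments each initial CAS into child CASes.
process_descriptor_CM = com.example.MyDocumentSegmenter
# The user's analysis engine: processes the child CASes.
process_descriptor_AE = com.example.MyAnnotatorAAE
# CAS consumer: typically receives the initial CAS after all of its
# child CASes have been processed, and finalizes the collection.
process_descriptor_CC = com.example.MyZipWriter
```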

The job model is described in more detail in the duccbook at
https://uima.apache.org/d/uima-ducc-2.2.2/duccbook.html#x1-181000III

Regards,
Eddie


On Wed, Apr 11, 2018 at 5:16 AM, Erik Fäßler <erik.faessler@uni-jena.de>
wrote:

> Hi all,
>
> I am doing my first steps with UIMA DUCC. I stumbled across the issue that
> my CAS consumer has allowMultipleDeployments=false since it is supposed to
> write multiple CAS document texts into one large ZIP file.
> DUCC complains about the discrepancy of the processing AAE being allowed
> for multiple deployment but one of its containers (my consumer) is not.
> I did specify the consumer with the "process_descriptor_CC" job file key
> and was assuming that DUCC would take care of it. After all, it is a key of
> its own. But it seems the consumer is just wrapped into a new AAE together
> with my annotator AAE. This new top AAE created by DUCC causes the error:
> My own AAE is allowed for multiple deployment and so are its delegates. But
> the consumer is not, of course.
>
> How to handle this case? The documentation of DUCC is rather vague at this
> point. There is the section about CAS consumer changes but it doesn’t
> mention multiple deployment explicitly.
>
> What is the “process_descriptor_CC” for when it gets wrapped up into an AAE
> with the user-delivered AAE anyway?
>
> Thanks and best regards,
>
> Erik
>
>
