uima-user mailing list archives

From Eddie Epstein <eaepst...@gmail.com>
Subject Re: UIMAj3 ideas
Date Fri, 10 Jul 2015 14:28:08 GMT
Hi Petr,

Good comments which will likely generate lots of responses.
For now please see comments on scaleout below.

On Thu, Jul 9, 2015 at 6:52 PM, Petr Baudis <pasky@ucw.cz> wrote:

>   * UIMAfit is not part of core UIMA and UIMA-AS is not part of core
>     UIMA.  It seems to me that UIMA-AS is doing things a bit differently
>     than what the original UIMA idea of doing scaleout was.  The two
>     things don't play well together.  I'd love a way to easily take
>     my plain UIMA pipeline and scale it out, ideally without any code
>     changes, *and* avoid the terrible XML config files.
>
>
It is not clear what you mean by the "original UIMA idea of doing
scaleout"; the CPE? Core UIMA is a single-threaded, embeddable framework.
UIMA-AS is also an embeddable framework, one that offers flexible vertical
(multi-threading) and horizontal (multi-process) options for deploying an
arbitrary pipeline. Admittedly, scaleout with UIMA-AS is complicated, and
its minimal support for process management makes it difficult to do
scaleout simply. In what ways do you think UIMA-AS is inconsistent with
UIMA or UIMA scaleout?

DUCC is a full cluster management application that will scale out a plain
UIMA pipeline with no code changes, assuming that the application code is
thread-safe. But a typical pipeline with a single collection reader
creating input CASes and a single CAS consumer will limit scaleout
performance pretty quickly. DUCC makes it easy to eliminate the input data
bottleneck. The DUCC sample apps show one approach to eliminating the
output bottleneck. Have you looked at DUCC?
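
To make the bottleneck point concrete, here is a rough back-of-the-envelope
model, not anything taken from DUCC or UIMA itself. It assumes a pipeline
of one serial reader, N parallel analysis workers, and one serial consumer,
and treats steady-state throughput as the rate of the slowest stage. The
function name and all of the rates are hypothetical, chosen only for
illustration.

```python
def pipeline_throughput(reader_rate, worker_rate, n_workers, consumer_rate):
    """Steady-state CASes/sec for: reader -> N parallel workers -> consumer.

    Toy model: the slowest stage bounds the whole pipeline. The reader and
    the consumer are serial, so only the middle stage scales with n_workers.
    """
    return min(reader_rate, worker_rate * n_workers, consumer_rate)


# Hypothetical rates in CASes per second (illustrative numbers only).
READER_RATE = 200.0    # single collection reader
WORKER_RATE = 10.0     # one analysis worker
CONSUMER_RATE = 150.0  # single CAS consumer

if __name__ == "__main__":
    for n in (1, 8, 16, 32, 64):
        rate = pipeline_throughput(READER_RATE, WORKER_RATE, n, CONSUMER_RATE)
        print(f"{n:3d} workers: {rate:6.1f} CASes/sec")
```

With these made-up numbers, adding workers stops helping once the single
consumer saturates at 150 CASes/sec. Removing that limit only exposes the
single reader as the next ceiling, which is why both ends of the pipeline
have to be addressed.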

Regards,
Eddie
