uima-user mailing list archives

From Timo Boehme <timo.boe...@ontochem.com>
Subject Re: Parallel CAS consumer
Date Wed, 10 Oct 2012 14:30:30 GMT

Thank you very much for all the feedback.

On 10.10.2012 15:24, Eddie Epstein wrote:
> ...
> Another approach with all in the same process, also using UIMA-AS,
> would be to use a CAS multiplier to replicate a CAS and have the
> flow controller send a copy to each of the two delegates, in parallel.
> The results could then be merged back into one CAS with another
> CAS multiplier. Both CMs and the two delegates could be implemented
> in an aggregate, so that the child CASes would only exist in the
> aggregate and all results would be returned in the original input CAS.

This is exactly the solution I was thinking about. However, I haven't
used UIMA-AS so far because we do not distribute processing over multiple
machines but run everything on a single multi-core server. I would
therefore rather avoid setting up a resource-consuming service
infrastructure with slow socket communication and keep everything in the
same Java VM.
Is this nevertheless possible with UIMA-AS ("... in the same process")?
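
To make the requirement concrete (one instance per consumer, but different consumers running in parallel inside one JVM), here is a minimal sketch in plain Java. It does not use any UIMA or UIMA-AS API: the class name `ParallelConsumers` and its methods are hypothetical, and the "document" is just a generic value. It also assumes the consumers only read the shared document; with real CASes each consumer would probably need its own copy, as Eddie describes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/**
 * Hypothetical helper, not part of UIMA: every consumer gets its own
 * single worker thread, so each consumer instance sees the documents one
 * at a time and in order, while different consumers work concurrently.
 */
final class ParallelConsumers<T> implements AutoCloseable {
    private final List<Consumer<T>> consumers;
    private final List<ExecutorService> workers = new ArrayList<>();

    ParallelConsumers(List<Consumer<T>> consumers) {
        this.consumers = new ArrayList<>(consumers);
        for (int i = 0; i < this.consumers.size(); i++) {
            // One dedicated thread per consumer keeps it single-instance.
            workers.add(Executors.newSingleThreadExecutor());
        }
    }

    /** Hands one document to all consumers and blocks until all are done. */
    void process(T document) {
        List<Future<?>> pending = new ArrayList<>();
        for (int i = 0; i < consumers.size(); i++) {
            Consumer<T> consumer = consumers.get(i);
            pending.add(workers.get(i).submit(() -> consumer.accept(document)));
        }
        for (Future<?> f : pending) {
            try {
                f.get(); // wait for this consumer to finish the document
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            } catch (ExecutionException e) {
                throw new IllegalStateException("consumer failed", e.getCause());
            }
        }
    }

    /** Lets the worker threads terminate once no more documents follow. */
    @Override
    public void close() {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
    }
}
```

Calling `process(doc)` once per document would then let, say, a database writer and a file writer work on the same document at the same time, with the slower of the two determining when the next document can start.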


> On Wed, Oct 10, 2012 at 8:56 AM, Jens Grivolla <j+asf@grivolla.net> wrote:
>> Hi all,
>> from what I understand this does not involve CAS multipliers at all, but
>> simply a flow where all CAS consumers are done in one "parallel step".
>> Apparently this can't be done in a CPE so you would need an aggregate of all
>> the CAS consumers, and have a parallel flow controller for that aggregate.
>> However, that wouldn't really do any good according to the documentation:
>> "ParallelStep, which specifies that multiple Analysis Engines should receive
>> the CAS next, and that the relative order in which these Analysis Engines
>> execute does not matter. Logically, they can run in parallel. The runtime is
>> not obligated to actually execute them in parallel, however, and the current
>> implementation will execute them serially in an arbitrary order."
>> Best,
>> Jens
>> On 10/10/2012 12:39 PM, Richard Eckart de Castilho wrote:
>>> Hi,
>>> I see. I think this is not possible. To my knowledge CPE (which you
>>> probably use) does not support CAS multipliers. I'm not too familiar with
>>> UIMA-AS, are you sure that it supports such a scenario?
>>> If you manage to realize the scenario as you described, it would be
>>> great to hear how you did it.
>>> Best,
>>> -- Richard
>>> On 10.10.2012 at 12:15, Timo Boehme <timo.boehme@ontochem.com> wrote:
>>>> Hi,
>>>> On 10.10.2012 12:05, Richard Eckart de Castilho wrote:
>>>>> the main difference between CAS consumers and analysis engines is
>>>>> that the former by default run only a single instance and the latter
>>>>> can be multiplied. If your consumer code can be run in parallel, just
>>>>> try inheriting from AnalysisEngine_ImplBase (or something like that)
>>>>> instead.
>>>> Thanks for your answer. However, each consumer must run as a single
>>>> instance (e.g. one database consumer, one consumer writing to a file).
>>>> Thus I would like to have a single instance per consumer, but have the
>>>> different consumers run in parallel.
>>>> Kind regards,
>>>> Timo
>>>>> On 10.10.2012 at 12:00, Timo Boehme <timo.boehme@ontochem.com> wrote:
>>>>>> Hi,
>>>>>> is there any possibility without using UIMA-AS to run different CAS
>>>>>> consumer components of a pipeline in parallel?
>>>>>> The standard behavior is that the consumers are called in sequence.
>>>>>> Since in my case they don't depend on each other, it would be more
>>>>>> efficient to have them run in parallel. Can I use a CAS multiplier +
>>>>>> flow control to achieve this?


  Timo Boehme
  OntoChem GmbH
  H.-Damerow-Str. 4
  06120 Halle/Saale
  T: +49 345 4780474
  F: +49 345 4780471


  OntoChem GmbH
  Managing Director: Dr. Lutz Weber
  Registered office: Halle / Saale
  Register court: Stendal
  Registration number: HRB 215461
