uima-user mailing list archives

From "Zesch, Torsten" <torsten.ze...@uni-due.de>
Subject Re: Error when trying to drop CAS with FlowController
Date Sun, 06 Sep 2015 19:33:11 GMT
Thanks for your input.

To give some more information about our use case:
Our input is a mix of documents.
Only some of them are relevant and should be written by the consumer.
We also considered the solution with a special FeatureStructure, but
it has the disadvantage that the consumer needs to be aware of the marker.
It would be easier if some CASes could simply be dropped.
I guess this could even be useful for flat workflows.
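
For illustration, here is a rough sketch of the marker idea in plain Java. It
does not use the UIMA API: Doc, dropRequested, filter and run are hypothetical
stand-ins for a CAS, the special FeatureStructure, the FILTER and the outer
flow. It only shows that the filter always hands its input back while the
flow decides whether the consumer sees it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of the marker workaround; no UIMA classes are used. */
public class DropMarkerSketch {

    /** Hypothetical stand-in for a CAS that may carry a drop marker. */
    public static final class Doc {
        final String text;
        boolean dropRequested; // stands in for the special FeatureStructure

        Doc(String text) {
            this.text = text;
        }
    }

    /** Stand-in for FILTER: marks non-matching docs but always returns the input. */
    public static Doc filter(Doc doc, String keyword) {
        if (!doc.text.contains(keyword)) {
            doc.dropRequested = true;
        }
        return doc; // the input is handed back, so CAS-in/CAS-out still holds
    }

    /** Stand-in for the outer flow: marked docs end their flow before the consumer. */
    public static List<String> run(List<String> inputs, String keyword) {
        List<String> written = new ArrayList<>();
        for (String input : inputs) {
            Doc doc = filter(new Doc(input), keyword);
            if (doc.dropRequested) {
                continue; // analogous to routing the CAS to a final step
            }
            written.add(doc.text); // stand-in for the CONSUMER writing the doc
        }
        return written;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("uima flow", "unrelated", "uima cas"), "uima"));
    }
}
```

In this variant the check sits in the flow rather than in the consumer, so
the consumer stays unaware of the marker, but the outer flow has to know it.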

-Torsten


On 06/09/15 17:31, "Eddie Epstein" <eaepstein@gmail.com> wrote:

>Keeping the filter inside the INNER may still be useful to
>terminate any further processing in that AAE.
>
>outputsNewCASes=true is just saying that an aggregate is
>a CasMultiplier and *might* return child-CASes. It doesn't
>change the CAS-in/CAS-out contract for the component.
>
>I think a fair amount of logic would have to be reworked
>if that contract were changed. For sure in UIMA-AS,
>where supporting CM services is one of the more complex
>design issues. But maybe it would be interesting to see
>the pros vs cons of making that change.
>
>Eddie
>
>
>On Sun, Sep 6, 2015 at 11:20 AM, Richard Eckart de Castilho
><rec@apache.org> wrote:
>
>> That would require that the OUTER_AAE is aware of the filtering.
>> We would prefer if all customization/filtering/etc. could be done in the
>> INNER_AAE which is the declared extension point.
>>
>> In the worst case, we'd probably opt to move the FILTER to
>> the OUTER_AAE entirely and make filtering a default option.
>>
>> My assumption would be that the OUTER_AAE should not have a problem
>> if the INNER_AAE drops anything if INNER_AAE declares
>> outputsNewCASes=true.
>> But obviously that assumption is wrong - I/we just don't get why.
>>
>> Cheers,
>>
>> -- Richard
>>
>> On 06.09.2015, at 17:14, Eddie Epstein <eaepstein@gmail.com> wrote:
>>
>> > How about the filter adds a FeatureStructure indicating that the CAS
>> > should be dropped.
>> > Then when the INNER_AAE returns the CAS, the flow controller in the
>> > OUTER_AAE sends the CAS to FinalStep?
>> >
>> > Eddie
>> >
>> > On Sun, Sep 6, 2015 at 11:08 AM, Richard Eckart de Castilho
>> > <rec@apache.org> wrote:
>> >
>> >> Eddie,
>> >>
>> >> we (Torsten and I) have the case that a reader produces a number of
>> >> CASes and we want to filter out some of them because they do not
>> >> match a given criterion.
>> >>
>> >> The pipeline/flow structure we are using looks as follows:
>> >>
>> >> READER -> OUTER_AAE { AEs..., INNER_AAE { FILTER }, AEs..., CONSUMER }
>> >>
>> >> READER, OUTER_AAE, AEs and CONSUMER are assumed to be fixed.
>> >>
>> >> INNER_AAE is meant to be an extension point and the FILTER inside it
>> >> is meant to remove all CASes that do not match our criteria such
>> >> that those do not reach the CONSUMER.
>> >>
>> >> So we explicitly do not want certain CASes to continue down the
>> >> processing path.
>> >>
>> >> -- Richard
>> >>
>> >> On 06.09.2015, at 17:04, Eddie Epstein <eaepstein@gmail.com> wrote:
>> >>
>> >>> Richard,
>> >>>
>> >>> In general the input CAS must continue down some processing path.
>> >>> Where is it stored and what triggers its continued processing if it
>> >>> is not returned?
>> >>>
>> >>> Eddie
>> >>>
>> >>> On Sun, Sep 6, 2015 at 10:28 AM, Richard Eckart de Castilho
>> >>> <rec@apache.org> wrote:
>> >>>
>> >>>> Hi Eddie,
>> >>>>
>> >>>> in most cases, we use process(CAS) and in such a case what you
>> >>>> describe is very logical.
>> >>>>
>> >>>> However, when setting outputsNewCASes to true, doesn't the contract
>> >>>> change? My understanding is that processAndOutputNewCASes(CAS) is
>> >>>> being used in such a case. Why shouldn't it be ok that the iterator
>> >>>> returned by processAndOutputNewCASes does not contain the input CAS?
>> >>>>
>> >>>> Cheers,
>> >>>>
>> >>>> -- Richard
>> >>>>
>> >>>> On 06.09.2015, at 16:21, Eddie Epstein <eaepstein@gmail.com> wrote:
>> >>>>
>> >>>>> Hi Richard,
>> >>>>>
>> >>>>> FinalStep() in a CasMultiplier aggregate means to stop further flow
>> >>>>> in the aggregate and return the CAS to the component that passed
>> >>>>> the CAS into the aggregate, or if a child-CAS, passed the child's
>> >>>>> parent-CAS into the aggregate.
>> >>>>>
>> >>>>> FinalStep(true) is used to stop a child-CAS from being returned
>> >>>>> to the component. But the contract for an AE is CAS-in/CAS-out,
>> >>>>> which means a CAS coming into an AE must be returned.
>> >>>>>
>> >>>>> Eddie
>> >>>>>
>> >>>>> On Sun, Sep 6, 2015 at 9:59 AM, Richard Eckart de Castilho
>> >>>>> <rec@apache.org> wrote:
>> >>>>>
>> >>>>>> Hi Eddie,
>> >>>>>>
>> >>>>>> ok, but why can input CASes created outside the aggregate not be
>> >>>>>> dropped?
>> >>>>>>
>> >>>>>> Cheers,
>> >>>>>>
>> >>>>>> -- Richard
>> >>>>
>> >>>>
>> >>
>> >>
>>
>>

