Mailing-List: contact user-help@uima.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@uima.apache.org
Received-SPF: pass (athena.apache.org: domain of eaepstein@gmail.com
 designates 209.85.128.171 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <20140409023434.GP6156@machine.or.cz>
References: <20140409023434.GP6156@machine.or.cz>
Date: Wed, 9 Apr 2014 16:14:53 -0400
Message-ID: 
 <CAGRYgEMA619SVXcKYyyMt0JzqerX8cNucx6ODBi6rssZrOx30g@mail.gmail.com>
Subject: Re: Complex architectures with multiple CASes - how to?
From: Eddie Epstein <eaepstein@gmail.com>
To: user@uima.apache.org
Content-Type: multipart/alternative; boundary=089e013a04b6ec441804f6a1c0da

--089e013a04b6ec441804f6a1c0da
Content-Type: text/plain; charset=ISO-8859-1

One approach is for the CAS multiplier to copy the question view into each
search CAS, using the CasCopier, and to create a second view for search
result processing. Because the question's sofa is copied along with its
annotations, stand-off annotations such as Focus keep valid offsets.
Downstream annotators would then be multi-view, getting question analysis
results from one view and doing search analysis in the other.
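
To make the layout concrete, here is a rough sketch in plain Java. It is
a toy model, not the UIMA API: the classes MultiViewSketch, View, Cas and
Annotation and the view names "Question" and "Result" are made up for
illustration. In a real pipeline the copy would be done by the CasCopier
and the second view would come from createView(); the alignment step
simply reuses the regex idea from the question below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy model of the multi-view layout; these classes are NOT the UIMA API. */
final class MultiViewSketch {

    /** A stand-off annotation: a type name plus offsets into its own view's text. */
    static final class Annotation {
        final String type;
        final int begin;
        final int end;
        Annotation(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** One view: a text (the sofa) plus the annotations that index into it. */
    static final class View {
        final String text;
        final List<Annotation> annotations = new ArrayList<>();
        View(String text) { this.text = text; }

        /** Deep copy; stands in for what a CasCopier would do for a real view. */
        View copy() {
            View v = new View(text);
            for (Annotation a : annotations) {
                v.annotations.add(new Annotation(a.type, a.begin, a.end));
            }
            return v;
        }
    }

    /** One CAS: a set of named views. */
    static final class Cas {
        final Map<String, View> views = new LinkedHashMap<>();
    }

    /** Multiplier step: one CAS per search result, each with its own copy of the question view. */
    static List<Cas> multiply(View question, List<String> searchResults) {
        List<Cas> out = new ArrayList<>();
        for (String result : searchResults) {
            Cas cas = new Cas();
            // Text and annotations are copied together, so the offsets stay valid.
            cas.views.put("Question", question.copy());
            cas.views.put("Result", new View(result));
            out.add(cas);
        }
        return out;
    }

    /** Downstream multi-view step: swap the Focus span for a capture group, then match the result view. */
    static String alignFocus(Cas cas) {
        View q = cas.views.get("Question");
        View r = cas.views.get("Result");
        for (Annotation a : q.annotations) {
            if (!a.type.equals("Focus")) continue;
            // Text after the Focus span, minus trailing punctuation, matched literally.
            String rest = q.text.substring(a.end).replaceAll("[?.!]+$", "");
            Matcher m = Pattern.compile("(\\S+)" + Pattern.quote(rest)).matcher(r.text);
            if (m.find()) return m.group(1);
        }
        return null; // no Focus annotation, or no match in this search result
    }

    public static void main(String[] args) {
        View question = new View("Who invented the transistor?");
        question.annotations.add(new Annotation("Focus", 0, 3)); // covers "Who"
        List<Cas> cases = multiply(question, Arrays.asList(
                "Bardeen, Brattain and Shockley invented the transistor at Bell Labs.",
                "The transistor is a semiconductor device."));
        for (Cas cas : cases) {
            System.out.println(alignFocus(cas));
        }
    }
}
```

Each search CAS is self-contained here, so the alignment step never needs
a second CAS. The cost is one copy of the question analysis per search
result, which may matter if the question view carries large parse trees.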

Eddie


On Tue, Apr 8, 2014 at 10:34 PM, Petr Baudis <pasky@ucw.cz> wrote:

>   Hi!
>
>   I'd like to ask about the philosophy and typical usage patterns behind
> multiple CASes, CAS multipliers and CAS mergers.
>
>   I'm working on a simple question-answering system built on top of
> UIMA and mirroring the DeepQA architecture.  Basically, on input I have
> a CAS with the input question as a sofa, and after some processing,
> a "search" CAS multiplier produces a CAS for each search result that
> might contain an answer.
>
>
>   However, at this point, I may want to use an AE that needs to see both
> the question CAS and the search result CAS.  For example, I could try to
> align sentences: with the question sofa "Who invented the transistor?"
> and a stand-off Focus annotation for "Who", I may want to search the
> result CAS for "(\S+) invented the transistor".
>
>   But now I'm stuck.  How can I build an AE that has access to
> information in two CASes?  It seems one approach is to copy feature
> structures to the result CAS in the multiplier.  However, if the CAS
> sofa is different, how can stand-off annotations (like Focus) be carried
> over?  Also, I may want to match parse trees instead of strings, which
> means potentially a lot of data is copied, and I will need to
> distinguish annotations of the question from those of the search result.
> A similar problem, but in a much clumsier form, seems to arise if I were
> to make the alignment AE a CAS merger.
>
>
>   I must be missing something obvious here, but reading the developer guide
> back and forth doesn't help... Thanks for any hints!
>
>                                 Petr Baudis
>

--089e013a04b6ec441804f6a1c0da--