Date: Fri, 23 Aug 2013 20:41:58 +0530
Subject: Re: Processing a List of Strings with UIMA Addons components
From: harshal patni
To: "user@uima.apache.org"
In-Reply-To: <520199E2.5020606@schor.com>

Hello Marshall,

Thank you for the suggestion! This works for us!

As per your suggestion, we have now created an Aggregate Analysis Engine that
contains a CAS Multiplier (splitter), our original aggregate engine, and a CAS
merger (to merge the results into one CAS at the end). However, the final
merged CAS contains the parent CAS as well as the child CASes created in the
splitter. Is this expected? Any idea why?

We also used the CAS splitter and merger in a synchronous UIMA pipeline. There,
the parent CAS does not appear in the final merged result. Why the difference?

Harshal

On Wed, Aug 7, 2013 at 6:20 AM, Marshall Schor wrote:
>
> On 8/6/2013 6:10 PM, Mathaeus Dejori wrote:
> > Hi,
> >
> > I'd like to use UIMA AS to annotate a large list of text segments. Instead
> > of passing each text segment individually to the AnalysisEngine I'd like to
> > pass the entire list at once.
> >
> > As far as I understand I can use the cas.setSofaDataArray() to pass a list
> > of Strings and get back Annotations that refer to particular segments.
> > However, in doing so I won't be able to use any of the existing Annotators
> > (e.g. Concept Mapper) as their process(cas, spec) function expects the
> > cas.getDocumentText().
> >
> > Is there a design pattern for uima to consume a list of strings, pass
> > individual elements to specific Annotators and combine all the results at
> > the end?
>
> If what you are trying to do is to take an input CAS which has a bunch of
> "strings" and send each one thru a pipeline, the normal UIMA design pattern
> for that is to use a CAS Multiplier at the start which gets as input the CAS
> with all the strings, and then puts each one into another CAS and send it
> through the pipeline. If the combining you want to do is to combine all the
> results into another CAS, then you can use another CAS Multiplier at the end
> which receives the individual string CASes, and accumulates results until all
> the parts are done, and then outputs a "result" CAS with the combined result.
>
> See
> http://uima.apache.org/d/uimaj-2.4.1/tutorials_and_users_guides.html#ugr.tug.cm
>
> -Marshall
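
For readers following the thread, the split / process / merge flow described above can be sketched in plain Java. This is only an illustration of the data flow and deliberately does not use the UIMA API: `SplitMergeSketch`, `Span`, `annotate`, and `processAll` are hypothetical names, and the whitespace tokenizer is a stand-in for whatever per-string pipeline (e.g. Concept Mapper) would actually run on each child CAS.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitMergeSketch {

    /** Hypothetical stand-in for an annotation: a span inside one input string. */
    static final class Span {
        final int segment; // index of the input string this span belongs to
        final int begin;   // start offset within that string
        final int end;     // end offset (exclusive) within that string

        Span(int segment, int begin, int end) {
            this.segment = segment;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Stand-in for the per-string pipeline: marks each whitespace-delimited token. */
    static List<Span> annotate(int segment, String text) {
        List<Span> spans = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            // Skip any whitespace before the next token.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) i++;
            int begin = i;
            // Advance to the end of the token.
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) i++;
            if (i > begin) spans.add(new Span(segment, begin, i));
        }
        return spans;
    }

    /** Splits the list, runs each string through annotate, and merges the results. */
    static List<Span> processAll(List<String> segments) {
        List<Span> merged = new ArrayList<>();
        for (int s = 0; s < segments.size(); s++) {
            // "Splitter" role: each string is handled as its own unit of work.
            String child = segments.get(s);
            // "Merger" role: accumulate results, keeping the segment index
            // so every span can be traced back to its source string.
            merged.addAll(annotate(s, child));
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> segments = Arrays.asList("heart attack", "no fever");
        for (Span sp : processAll(segments)) {
            System.out.println(sp.segment + ": "
                    + segments.get(sp.segment).substring(sp.begin, sp.end));
        }
    }
}
```

In this sketch only results derived from the individual strings reach the merged list; the original input list itself is never added to it. Whether a real UIMA deployment also forwards the parent CAS to the merger depends on its flow configuration, which this sketch does not model.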