From: Andrew Shirk
To: uima-user@incubator.apache.org
Date: Wed, 29 Aug 2007 13:27:31 -0500
Subject: RE: read/write resource sharing

Hi Frank,

At 11:29 AM 8/29/2007, you wrote:
>Is there a reason that the configuration parameters that are being read
>by the last annotator can't be an annotation on the given document.

They have nothing to do with the document per se.

>For
>example, one can imagine a pipeline where there is an Analysis Engine
>that checks the language of the document and then a separate
>morphological tokenizer creates morpheme annotations using this
>information. The natural way to do now would be to set a new
>DocumentAnnotation on the document with the appropriate language and
>have the tokenizer AE read this.

Yes, in that case, the CAS would be used pretty much as it was intended.
I'm stumbling on the conceptual mismatch between my configuration
variables, and an "annotation."

> Does the analysis engine you are
>dealing with wrap something that requires an actual configuration file
>or can it take for example a string argument instead?

Yes, I'm creating an annotator that wraps a legacy process flow
execution system. The execution system ingests a process flow
description (a graph of work nodes) in XML, and then executes it. Right
now, I have the path to the flow description file specified in an
analysis engine parameter, which has the obvious downside of requiring a
user of the annotator to edit the engine descriptor whenever they want
to change the process flow that will be executed.
I thought that using a DataResource would allow me to store the path in
an external file, which could be easily edited by hand by a user, and
then read in by the DataResource implementation. With UIMA's support of
resource sharing, I thought it would be straightforward to write to the
file, or UIMA's cached version of the file in memory, for downstream
annotators to use. With this approach, I could reuse my process flow
wrapper annotator multiple times within an aggregate without needing to
edit the descriptor. I was trying to avoid describing all the details,
but this should help you better understand my scenario.

>In this case it
>might even be best to create the string for the second annotator on the
>fly and send it directly rather than writing to the disk somewhere. I
>think that if you have access to the code it would be better to treat
>everything that changes from document to document as belonging on the
>CAS and put all the configuration parameters in the AE descriptor.

Yes, that may be the best approach given the current state of UIMA. If
you have any further thoughts now that I've elaborated on my problem,
I'd love to hear them.

Thanks,
Andrew
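
A minimal sketch of the external-file idea described above, written in plain Java with no UIMA dependency so that it runs on its own. The class name FlowConfig, the property key flow.description.path, and the choice of a properties file are all hypothetical; none of them come from this thread or from UIMA. In a UIMA setup, logic like this would presumably live inside a SharedResourceObject.load(DataResource) implementation and read from DataResource.getInputStream().

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/**
 * Hypothetical holder for the flow description path, read from a small
 * hand-editable properties file instead of the engine descriptor.
 */
final class FlowConfig {
    // Hypothetical property key; not defined by UIMA.
    static final String FLOW_PATH_KEY = "flow.description.path";

    private final String flowDescriptionPath;

    private FlowConfig(String flowDescriptionPath) {
        this.flowDescriptionPath = flowDescriptionPath;
    }

    /**
     * Reads the settings from a stream. Inside UIMA, the stream would be
     * the one handed to the shared resource when it is loaded.
     */
    static FlowConfig load(InputStream in) throws IOException {
        Properties props = new Properties();
        props.load(in);
        String path = props.getProperty(FLOW_PATH_KEY);
        if (path == null || path.trim().isEmpty()) {
            throw new IOException("Missing property: " + FLOW_PATH_KEY);
        }
        return new FlowConfig(path.trim());
    }

    /** Convenience overload for reading directly from a file on disk. */
    static FlowConfig load(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return load(in);
        }
    }

    String getFlowDescriptionPath() {
        return flowDescriptionPath;
    }
}
```

A user would then change the flow by editing one line in the external file, for example "flow.description.path = flows/tokenize.xml", rather than the engine descriptor. This covers only the read side; it does not address writing to the resource for downstream annotators, which is the open question in this thread.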