From: "Valkyrie Savage"
Reply-To: uima-user@incubator.apache.org
Date: Mon, 22 Jun 2009 13:34:24 +0200
Subject: CAS Multipliers and Pipeline Troubles

Hello, all,

I'm working on a project involving UIMA, and I've run into some difficulties that I can't figure out.
This is my first month working with UIMA, so I am admittedly not well-versed in all its components and interactions, but I'll try to describe my problem as best I can. I'm running UIMA 2.2.2-incubating with Java 1.6 inside Eclipse Ganymede.

The project involves processing rather large documents, and the in-house components that I'm using have difficulty reading in a book-length chunk of text at a time. For this reason, I've developed a very simple CAS multiplier: it takes in a CAS that contains Segment annotations and generates a new CAS for each Segment. This multiplier is contained in an aggregate AE, and the other components of the AE add a few new annotations. At the end of the aggregate is a simple CAS demultiplier; it is based heavily on the example in org.apache.uima.examples.casMultiplier, except that I hardcoded the tags that I want to copy across the demultiplying.

The problem is that the split CASes are being tagged correctly and merged correctly, but for whatever reason the merged CAS is not the one that is sent on through the rest of the pipeline after this aggregate AE. I have a simple CAS printer running at the end of the next() function of my demultiplier, and it shows that only the tags I wanted are retained after the merge. However, the tags that should have been stripped appear again if I add an AnnotationWriter in the next step of the pipeline. I read about Flow Controllers, and it seems that the original CAS should be dropped from the pipeline by default, since new CASes are being created from it (I am not using any kind of user-defined Flow Controller), but that doesn't seem to be happening. None of the new tags added in the aggregate AE are preserved, while all the tags that are supposed to be stripped out are.

If there's more information needed, I'll be happy to provide it.
As I mentioned, I'm new to UIMA, and I'm not sure how to go about trying to debug this.

Thank you!

Valkyrie Savage
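
The split, annotate, and merge flow described in this message can be modelled in plain Java to make the expected result concrete. This is only a rough sketch and deliberately does not use the UIMA API: `Doc`, `Ann`, `KEEP`, `split`, and `merge` are hypothetical stand-ins for a CAS, an annotation, the hardcoded tag list, the multiplier, and the demultiplier. It also assumes that the merger concatenates the segment texts and shifts annotation offsets accordingly, which the message does not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of the flow; none of these types are UIMA classes.
class SegmentFlowSketch {

    /** A typed span over some text (stand-in for an annotation). */
    static final class Ann {
        final String type;
        final int begin;
        final int end;

        Ann(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Document text plus its annotations (stand-in for a CAS). */
    static final class Doc {
        final String text;
        final List<Ann> anns = new ArrayList<Ann>();

        Doc(String text) {
            this.text = text;
        }
    }

    // Hypothetical hardcoded list of types to carry across the merge.
    static final Set<String> KEEP = new HashSet<String>(Arrays.asList("Token"));

    /** Multiplier step: one child Doc per "Segment" annotation in the parent. */
    static List<Doc> split(Doc parent) {
        List<Doc> children = new ArrayList<Doc>();
        for (Ann a : parent.anns) {
            if (a.type.equals("Segment")) {
                children.add(new Doc(parent.text.substring(a.begin, a.end)));
            }
        }
        return children;
    }

    /** Merger step: concatenate child texts and copy only KEEP types, shifting offsets. */
    static Doc merge(List<Doc> children) {
        StringBuilder text = new StringBuilder();
        List<Ann> kept = new ArrayList<Ann>();
        for (Doc child : children) {
            int offset = text.length(); // where this child's text starts in the merged text
            for (Ann a : child.anns) {
                if (KEEP.contains(a.type)) {
                    kept.add(new Ann(a.type, a.begin + offset, a.end + offset));
                }
            }
            text.append(child.text);
        }
        Doc merged = new Doc(text.toString());
        merged.anns.addAll(kept);
        return merged;
    }
}
```

In this model, a downstream step that receives the result of `merge` sees only the `KEEP` types, while a step that still receives the parent `Doc` sees the original annotations and none of the new ones. The second case matches the symptom described in the message.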