From: "Thomas Gruber"
Reply-To: user@uima.apache.org
Subject: UIMA subiterator
Date: Wed, 3 Apr 2013 12:48:22 +0200

Hi,

I have a problem with the subiterator and would appreciate any help on this issue. I use uimaFIT 1.4.0 and uimaj-core 2.4.0.

The subiterator in use is:
http://uima.apache.org/d/uimaj-2.3.1/api/org/apache/uima/cas/text/AnnotationIndex.html#subiterator%28org.apache.uima.cas.text.AnnotationFS,%20boolean,%20boolean%29

I have an annotation "Sentence" which covers a sentence, and an annotation "Value" which covers numerical values in a sentence.

Example: "This is sentence A with no value. This is sentence B with value 377." produces
- two "Sentence" annotations: "This is sentence A with no value." and "This is sentence B with value 377."
- one "Value" annotation: "377"

To get all "Value" annotations of each "Sentence", I iterate over the "Sentence" annotations and then use a subiterator to iterate over all "Value" annotations within the current sentence.

The problem occurs if I set the "strict" parameter to "false". In this case, the subiterator for the first sentence also returns the value of the second sentence, i.e. "377" is returned for the sentence "This is sentence A with no value."

According to the javadoc, only annotations b should be returned for which "annot < b and annot.getBegin() <= b.getBegin() <= annot.getEnd()".

The (abstracted) code fragment looks as follows:

    // Iterate over all sentences in the CAS.
    Iterator<Sentence> it = JCasUtil.iterator(myCAS, Sentence.class);
    while (it.hasNext()) {
        Sentence sa = it.next();
        // Subiterator over the values in this sentence (ambiguous = false, strict = false).
        Iterator<Value> it3 = JCasUtil.iterator(sa, Value.class, false, false);
        while (it3.hasNext()) {
            Value tempAmount = it3.next();
            // A value that begins after the sentence ends should never be returned.
            if (tempAmount.getBegin() > sa.getEnd()) {
                System.out.println(" -------------------------------- ERROR ------------------------- This sentence covers a value which is out of bounds!");
            }
        }
    }

Thank you in advance.

Best regards,
Thomas
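
To make the expected behaviour concrete, here is a minimal sketch in plain Java, with no UIMA dependency, of the non-strict condition quoted from the javadoc, applied to the example text. The class SubiteratorBounds, the Span type and the method names are hypothetical stand-ins for illustration only; they are not part of UIMA or uimaFIT, and the sketch says nothing about how the real subiterator is implemented.

```java
final class SubiteratorBounds {

    /** Hypothetical stand-in for an annotation: a [begin, end) character span. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    /** Non-strict condition from the javadoc: a.begin <= b.begin <= a.end. */
    static boolean nonStrictCovers(Span a, Span b) {
        return a.begin <= b.begin && b.begin <= a.end;
    }

    /** Strict condition: b lies entirely inside a. */
    static boolean strictCovers(Span a, Span b) {
        return a.begin <= b.begin && b.end <= a.end;
    }

    public static void main(String[] args) {
        String text = "This is sentence A with no value. "
                + "This is sentence B with value 377.";

        // Derive the offsets from the text instead of hard-coding them.
        int endOfA = text.indexOf('.') + 1;
        Span sentenceA = new Span(0, endOfA);
        Span sentenceB = new Span(endOfA + 1, text.length());
        int valueBegin = text.indexOf("377");
        Span value = new Span(valueBegin, valueBegin + "377".length());

        // "377" begins after sentence A ends, so neither condition holds for A.
        System.out.println(nonStrictCovers(sentenceA, value)); // false
        System.out.println(strictCovers(sentenceA, value));    // false
        // "377" lies inside sentence B, so both conditions hold for B.
        System.out.println(nonStrictCovers(sentenceB, value)); // true
        System.out.println(strictCovers(sentenceB, value));    // true
    }
}
```

Under this reading of the javadoc, "377" fails the non-strict condition for sentence A, which is why the result of the subiterator looks wrong. A check like nonStrictCovers could also serve as a manual filter on the returned annotations until the cause is clear.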