incubator-ctakes-dev mailing list archives

From "Masanz, James J." <>
Subject RE: UIMA's subiterator
Date Thu, 09 Aug 2012 15:32:02 GMT
I think a UIMA subiterator makes it too easy to introduce bugs.

I have never used org.uimafit.util.JCasUtil.selectCovered, but I prefer
the idea of that to a subiterator - provided the speed is not too bad.
For example, given the warning in the javadocs about the speed of 
selectCovered(JCas, Class, int, int)
I suggest we stay away from that one but definitely try the one you mentioned:
selectCovered(JCas, Class, Annotation)

James Masanz

> -----Original Message-----
> From: ctakes-dev-return-221- [mailto:ctakes-dev-return-]
> On Behalf Of Chen, Pei
> Sent: Thursday, August 09, 2012 10:18 AM
> To:
> Subject: UIMA's subiterator
> To get all the BaseTokens for a particular sentence, if we use the
> subiterator, the types have to be stored in the FSindexes in a certain
> order; otherwise it could just return an empty list.  This would require
> the users of annotators to understand the ordering of types and have it
> preconfigured.
> FSIterator<Annotation> tokensInSentenceIterator =
> jcas.getAnnotationIndex(BaseToken.type).subiterator(sentence);
> uimaFIT already created a convenience method that seems to do something
> similar and will always return the expected tokens.  Does anyone know if
> this was part of the motivation?  Is the performance hit (if any) worth
> the ease of use?
> Ex:
> List<BaseToken> tokens = org.uimafit.util.JCasUtil.selectCovered(jCas,
>     BaseToken.class, sentence);
> Another alternative is UIMA's
> There are a few places that use subiterator in cTAKES and it's tempting
> to use uimaFIT's JCasUtil.selectCovered() instead... What do others think?
> Background: This issue surfaced when we use the cTAKES GUI (which uses
> uimaFIT to wire the components together instead of the Aggregate XML
> descriptor).
> --Pei
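
For readers following the thread, here is a rough sketch of the "covered" selection idea discussed above: keep every candidate whose offsets fall inside the covering annotation, using offset comparison only. It is a plain-Java illustration, not uimaFIT's implementation. The class CoveredSketch, the Span type, and this selectCovered helper are hypothetical stand-ins for a JCas, an Annotation, and JCasUtil.selectCovered, and the linear scan says nothing about how the real method performs.

```java
import java.util.ArrayList;
import java.util.List;

public class CoveredSketch {

    /** Hypothetical stand-in for a UIMA annotation: a label plus character offsets. */
    public static final class Span {
        public final String label;
        public final int begin;
        public final int end;

        public Span(String label, int begin, int end) {
            this.label = label;
            this.begin = begin;
            this.end = end;
        }
    }

    /**
     * Returns the candidates whose offsets lie within [cover.begin, cover.end].
     * Only offsets are compared, so the result does not depend on any
     * ordering configured between types (an assumption of this sketch).
     */
    public static List<Span> selectCovered(List<Span> candidates, Span cover) {
        List<Span> covered = new ArrayList<>();
        for (Span s : candidates) {
            if (s.begin >= cover.begin && s.end <= cover.end) {
                covered.add(s);
            }
        }
        return covered;
    }

    public static void main(String[] args) {
        Span sentence = new Span("Sentence", 0, 11);
        List<Span> tokens = new ArrayList<>();
        tokens.add(new Span("Hello", 0, 5));
        tokens.add(new Span("world", 6, 11));
        tokens.add(new Span("Next", 12, 16)); // lies outside the sentence

        for (Span t : selectCovered(tokens, sentence)) {
            System.out.println(t.label);
        }
    }
}
```

Running main prints "Hello" and "world"; the third token starts after the sentence ends and is left out.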
