uima-user mailing list archives

From Thilo Goetz <twgo...@gmx.de>
Subject Re: .addToIndexes() on subtype while iterating over supertype
Date Thu, 09 Oct 2008 16:22:02 GMT


Александър Л. Димитров wrote:
> Hello,
> 
> I have the following design issue with one of my AnalysisEngines. I searched the
> documentation, but not exhaustively, so pardon me if I'm doing something
> fundamentally wrong or there is an easy obvious solution.
> 
> I have a type T1 and its subtype T2. They would both span the same text in the
> CAS, and while T1 represents a general data structure, T2 represents a more
> specific one. Say, T1 represents a sentence and T2 a sentence of a certain kind.
> 
> In order to find all T2s in the text, I first have to find all T1s,
> then declare some T1s as T2s. Currently, I first mark up all T1s in an AE,
> then, in the next step and another AE, iterate over all T1's, look at their
> features and decide whether or not a T1 is a T2. In my particular example, I
> have to first do sentence boundary detection, then, after a few other AE's have
> done additional work, decide whether a particular sentence contains a trigger.
> 
> So, I iterate over T1:
> 
> final AnnotationIndex ai = cas.getAnnotationIndex(T1.class);
> for (final Iterator<T1> i = ai.iterator(); i.hasNext(); ) {
>     final T1 t1 = i.next(); // throws ConcurrentModificationException …
>     if (matchesDescription(t1)) {
>         final T2 t2 = new T2(cas);
>         doStuff(t2);
>         t2.addToIndexes(); // … because we modified T1's indexes by adding a T2
>                            // to them
>     }
> }
> 
> As you can see, this code won't work: the Iterator's domain changes during
> iteration, because the subtype shares an index 'pool' (or so) with the
> supertype. Does this mean that the AnnotationIndex returned by
> cas.getAnnotationIndex(Foo.class) will always contain all instances of
> Foo.class in the CAS, *and* all instances of all its subclasses?

Yes.

> 
> Apart from just caching all T2's I want to add to the indexes in an ArrayList
> and then adding them after the iteration of T1's is finished, are there any
> other solutions? I wouldn't like to break up the semantic tie of inheritance
> between T1 and T2.

No, that is the recommended solution to this issue.  I don't
see anything wrong with it.  This is not specific to the CAS,
btw.  You always get into these kinds of issues when you try
to modify a collection that you're currently iterating over.

You may also want to remove the old T1s from the index,
since they'll be replaced by the new T2s.  That removal also
has to happen in a separate step, after the iteration.
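
Since the problem is not specific to the CAS, here is a minimal sketch
of the buffer-then-update pattern using plain java.util collections
instead of a real index.  Sentence and TriggerSentence are hypothetical
stand-ins for T1 and T2, and matchesDescription is a made-up predicate.
In an actual annotator the last two steps would presumably be calls to
addToIndexes() and removeFromIndexes() on the buffered annotations.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

class DeferredIndexUpdate {

    // Hypothetical stand-in for T1.
    static class Sentence {
        final String text;
        Sentence(String text) { this.text = text; }
    }

    // Hypothetical stand-in for T2, a subtype of T1.
    static class TriggerSentence extends Sentence {
        TriggerSentence(String text) { super(text); }
    }

    // Made-up predicate, for illustration only.
    static boolean matchesDescription(Sentence s) {
        return s.text.contains("trigger");
    }

    // Adds to the list while iterating over it; returns true if the
    // iterator detected the modification and threw.
    static boolean naiveUpdateFails(List<Sentence> index) {
        try {
            for (Sentence s : index) {
                if (matchesDescription(s)) {
                    index.add(new TriggerSentence(s.text));
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Phase 1: iterate and only buffer the changes.
    // Phase 2: apply removals and additions after the iteration is done.
    static void deferredUpdate(List<Sentence> index) {
        List<Sentence> toAdd = new ArrayList<>();
        List<Sentence> toRemove = new ArrayList<>();
        for (Sentence s : index) {
            if (matchesDescription(s)) {
                toAdd.add(new TriggerSentence(s.text));
                toRemove.add(s);
            }
        }
        index.removeAll(toRemove);
        index.addAll(toAdd);
    }

    public static void main(String[] args) {
        List<Sentence> index = new ArrayList<>();
        index.add(new Sentence("a plain sentence"));
        index.add(new Sentence("a sentence with a trigger"));

        System.out.println("naive update failed: "
                + naiveUpdateFails(new ArrayList<>(index)));

        deferredUpdate(index);
        for (Sentence s : index) {
            System.out.println(s.getClass().getSimpleName() + ": " + s.text);
        }
    }
}
```

The two buffers keep the loop read-only, so the iterator never sees a
structural change; whether you buffer removals at all depends on
whether the old T1s should stay in the index.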

--Thilo

> 
> Thanks in advance,
> Aleks
