uima-user mailing list archives

From Ekaterina Buyko <ekaterina.bu...@uni-jena.de>
Subject Re: Iterators in CAS
Date Fri, 12 Oct 2007 09:05:13 GMT
Hi Christian,

Thank you very much.

What I originally had in mind was a method in UIMA such as:
Sentence[] sentences = token.getOverlapAnnotation(Sentence.type);
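
To make the idea concrete, here is a minimal plain-Java sketch of what such a lookup would have to do. It does not use the UIMA API: Span and OverlapSketch are hypothetical stand-ins for illustration only, and the overlap test assumes half-open [begin, end) offsets.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an annotation: a type name plus begin/end offsets.
final class Span {
    final String type;
    final int begin;
    final int end;

    Span(String type, int begin, int end) {
        this.type = type;
        this.begin = begin;
        this.end = end;
    }
}

final class OverlapSketch {
    // Returns every candidate of the requested type whose [begin, end) range
    // shares at least one character position with the target span.
    static List<Span> overlapping(Span target, List<Span> candidates, String type) {
        List<Span> result = new ArrayList<>();
        for (Span c : candidates) {
            boolean sameType = c.type.equals(type);
            boolean overlaps = c.begin < target.end && target.begin < c.end;
            if (sameType && overlaps) {
                result.add(c);
            }
        }
        return result;
    }
}
```

This linear scan costs one pass over the candidates per lookup; it only illustrates the semantics I am after, not an efficient implementation.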

But I still have some questions about your proposal:

Getting an iterator over all annotations is fine.
Do you know in what order the annotations are returned?

If I have, for example, these annotations (the numbers are the respective begin and end offsets)
NP np (0,10)
Token token1(0,5), token2(6, 10)

Then I get the annotation index. How are they ordered?
np, token1, token2?

And what happens if they have the same span?
NP np (0,5)
Token token1(0,5)
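
For reference, the built-in UIMA annotation index is documented as sorting by ascending begin offset, then descending end offset, then by type priority where priorities are defined. The sketch below models that comparison in plain Java so that the two cases above can be checked. It is not UIMA code: Anno, IndexOrderSketch and the priority list are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for an annotation: a type name plus begin/end offsets.
final class Anno {
    final String type;
    final int begin;
    final int end;

    Anno(String type, int begin, int end) {
        this.type = type;
        this.begin = begin;
        this.end = end;
    }
}

final class IndexOrderSketch {
    // Models the ordering: begin ascending, end descending (longer span first),
    // then position in the type priority list (earlier entry first).
    static Comparator<Anno> order(List<String> typePriorities) {
        return (a, b) -> {
            if (a.begin != b.begin) {
                return Integer.compare(a.begin, b.begin);
            }
            if (a.end != b.end) {
                return Integer.compare(b.end, a.end);
            }
            return Integer.compare(typePriorities.indexOf(a.type),
                                   typePriorities.indexOf(b.type));
        };
    }

    // Returns a sorted copy and leaves the input list untouched.
    static List<Anno> sorted(List<Anno> input, List<String> typePriorities) {
        List<Anno> copy = new ArrayList<>(input);
        copy.sort(order(typePriorities));
        return copy;
    }
}
```

Under this model the first case comes out as np, token1, token2. In the second case the spans are identical, so the result depends entirely on the priority list: with NP listed before Token, np comes first. If no type priorities are declared, the relative order of annotations with identical spans should not be relied upon.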

With best regards

Katja



Christian Mauceri schrieb:
> Hi Ekaterina,
> 
> if I understood your question correctly, this is possible and is even a
> nice feature of UIMA. I have more or less the same problem: I have two
> types of annotations, contexts and forms (sentences and tokens for you).
> So I have TAEs which mark contexts and forms, and then another TAE (a CAS
> consumer in my very simple case) which does the following:
> 
>       // A context
>        TCollocation tc = null;
>       // A form
>        TForm f = null;
> 
>       // First, iterate over all the annotations
>        Iterator annot =
>            jcas.getJFSIndexRepository().getAnnotationIndex().iterator();
>        while(annot.hasNext()) {
>            Annotation a = (Annotation)annot.next();
>             // then I test if it is a context TCollocation or a form TForm
>            if (a instanceof TCollocation) {
>                tc = (TCollocation)a;
>                //System.out.println(tc.getMatch());
>            } else if (a instanceof TForm) {
>                f = (TForm) a;
>            }
>        }
> 
> That's all. The nice thing is that the iterator respects the position
> order in the text and the inclusion hierarchy, so you can be sure the
> current form belongs to the current context.
> 
> I hope this is helpful and that I did not talk baloney; at least it
> works fine for me.
> 
> Regards.
> Christian.
> 
> 
> Ekaterina Buyko wrote:
>> Hi all!
>>
>> In UIMA 2.1 it is possible to create a sub-iterator in order to 
>> iterate over annotations which are within the begin-end span of the 
>> selected type.
>>
>> For example:
>>
>> AnnotationIndex sentenceIndex = (AnnotationIndex) aJCas 
>> .getJFSIndexRepository().getAnnotationIndex(Sentence.type);
>>
>> AnnotationIndex tokenIndex = (AnnotationIndex) aJCas
>>                .getJFSIndexRepository().getAnnotationIndex(Token.type);
>>
>>        // iterate over Sentences
>>        FSIterator sentenceIterator = sentenceIndex.iterator();
>>        while (sentenceIterator.hasNext()) {
>>
>>            Sentence sentence = (Sentence) sentenceIterator.next();
>>
>>            // iterate over Tokens
>>            FSIterator tokenIterator = tokenIndex.subiterator(sentence);
>>
>>
>> I would like to have more extensive functionality. I need to find the 
>> annotations which lie in the begin-end span of the selected 
>> annotation type. These annotations may also overlap the span of the 
>> selected type.
>>
>> Take noun phrases, for example. If I iterate over tokens, I would like 
>> to know whether a given token is inside a noun phrase or not. At the 
>> moment I am working with Hashtables, but I am looking for another solution.
>>
>> How could I solve this problem?
>>
>> Best regards
>>
>> Ekaterina
>>
>>
>>
>>
> 

