uima-user mailing list archives

From Rashad <rash....@gmail.com>
Subject How to remove UIMA annotations?
Date Fri, 03 Jun 2011 10:55:32 GMT

I'm reasonably new to UIMA and trying to get it to do what I want. I'm 
attempting to perform entity extraction on 3 languages. I have an if statement 
at the start of each analysis engine (AE) which skips the document if its 
language is not, for example, English; another AE detects the language to begin 
with.

The next AE then tokenises the document (space tokeniser), the next AE then 
extracts entities, and a CAS consumer then writes the result to disk.

However, I don't want to write ALL the space-tokenised annotations to disk 
as well, only the extracted entities, as the files get very large very 
quickly! Once a token has been processed I want it to be removed from the 
CAS/JCas, but token.removeFromIndexes() (I'm using Java) just throws a 
ConcurrentModificationException.

How do I get around this?

This is my code:

AnnotationIndex<Annotation> token = aJCas.getAnnotationIndex(Token.type);
FSIterator<Annotation> timeIter = token.iterator();
while (timeIter.hasNext()) {
	Token currentToken = (Token) timeIter.next();
	Token previousToken = null;
	if (englishNamesAsTrie.search(currentToken.getToken().toLowerCase())) {
		PersonName annotation = new PersonName(aJCas);
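
One common way around this kind of exception is to avoid changing the index while an iterator over it is still open: collect the tokens in a separate list during the loop, then remove them once the loop has finished. The sketch below only illustrates that pattern with plain java.util collections, so it runs without UIMA. The Token class, the List standing in for the annotation index, and the Set standing in for englishNamesAsTrie are all hypothetical stand-ins, not the real types. In the actual annotator the second loop would presumably call removeFromIndexes() on each collected token instead of List.remove().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class RemoveAfterIteration {

    // Hypothetical stand-in for the UIMA Token annotation type.
    static final class Token {
        final String text;
        Token(String text) { this.text = text; }
    }

    // Walks the token index once, records matching names, and only removes
    // the tokens after the iteration is over.
    static List<String> extractNamesAndDropTokens(List<Token> tokenIndex,
                                                  Set<String> knownNames) {
        List<String> foundNames = new ArrayList<>();
        List<Token> toRemove = new ArrayList<>();

        // Pass 1: read-only iteration. Nothing is removed here, so the
        // iterator never sees the index change underneath it.
        for (Token t : tokenIndex) {
            if (knownNames.contains(t.text.toLowerCase(Locale.ROOT))) {
                foundNames.add(t.text); // stands in for creating a PersonName
            }
            toRemove.add(t);
        }

        // Pass 2: the iterator over tokenIndex is finished, so removal is safe.
        // With UIMA this would be t.removeFromIndexes() (assumption).
        for (Token t : toRemove) {
            tokenIndex.remove(t);
        }
        return foundNames;
    }

    public static void main(String[] args) {
        List<Token> index = new ArrayList<>(Arrays.asList(
                new Token("Alice"), new Token("met"), new Token("Bob")));
        Set<String> names = new HashSet<>(Arrays.asList("alice", "bob"));

        List<String> found = extractNamesAndDropTokens(index, names);
        System.out.println(found + " remaining tokens: " + index.size());
    }
}
```

The extra list costs some memory per document, but it keeps the iteration and the removal strictly separate, which is the point of the pattern.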
