uima-user mailing list archives

From "LeHouillier, Frank D." <Frank.LeHouill...@gd-ais.com>
Subject RE: read/write resource sharing
Date Wed, 29 Aug 2007 16:29:37 GMT
Is there a reason that the configuration parameters being read by the
last annotator can't be an annotation on the given document?  For
example, one can imagine a pipeline where an Analysis Engine checks the
language of the document and then a separate morphological tokenizer
creates morpheme annotations using this information.  The natural way
to do this would be to set a new DocumentAnnotation on the document
with the appropriate language and have the tokenizer AE read it.  Does
the analysis engine you are dealing with wrap something that requires
an actual configuration file, or can it take, for example, a string
argument instead?  If it can take a string, it might even be best to
create the string for the second annotator on the fly and send it
directly rather than writing it to disk somewhere.  I think that if you
have access to the code it would be better to treat everything that
changes from document to document as belonging on the CAS and put all
the configuration parameters in the AE descriptor.
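
To make the split concrete, here is a rough sketch in plain Java. It
deliberately does not use the UIMA API: Doc, Annotator,
LanguageDetector and Tokenizer are hypothetical stand-ins I made up for
the CAS and two AEs, and the language heuristic and split patterns are
toy values.  In a real aggregate I would expect the per-document value
to go through something like JCas.setDocumentLanguage() and
getDocumentLanguage() instead of a map, but please check that against
the version you are running.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class CasStateSketch {

    /** Hypothetical stand-in for a CAS: document text plus per-document state. */
    static final class Doc {
        final String text;
        final Map<String, String> metadata = new HashMap<>();
        List<String> tokens = List.of();

        Doc(String text) { this.text = text; }
    }

    /** Hypothetical stand-in for an annotator in the aggregate. */
    interface Annotator {
        void process(Doc doc);
    }

    /** Upstream step: records its per-document decision on the document itself. */
    static final class LanguageDetector implements Annotator {
        private static final Pattern ENGLISH_HINT = Pattern.compile("\\bthe\\b");

        @Override
        public void process(Doc doc) {
            // Toy heuristic, for illustration only.
            String lang = ENGLISH_HINT.matcher(doc.text).find() ? "en" : "unknown";
            doc.metadata.put("language", lang);
        }
    }

    /**
     * Downstream step. Its fixed settings arrive once, at construction
     * (the role the AE descriptor plays); the part that varies per document
     * is read from the document, not from a file on disk.
     */
    static final class Tokenizer implements Annotator {
        private final Map<String, String> splitPatternByLanguage;
        private final String defaultSplitPattern;

        Tokenizer(Map<String, String> splitPatternByLanguage, String defaultSplitPattern) {
            this.splitPatternByLanguage = splitPatternByLanguage;
            this.defaultSplitPattern = defaultSplitPattern;
        }

        @Override
        public void process(Doc doc) {
            String lang = doc.metadata.getOrDefault("language", "unknown");
            String pattern = splitPatternByLanguage.getOrDefault(lang, defaultSplitPattern);
            doc.tokens = Arrays.asList(doc.text.split(pattern));
        }
    }

    /** Runs the annotators in order over one document, like a fixed flow. */
    static Doc run(String text, List<Annotator> pipeline) {
        Doc doc = new Doc(text);
        for (Annotator annotator : pipeline) {
            annotator.process(doc);
        }
        return doc;
    }

    public static void main(String[] args) {
        List<Annotator> pipeline = List.of(
                new LanguageDetector(),
                new Tokenizer(Map.of("en", "\\s+"), "[\\s-]+"));
        Doc doc = run("the cat-sat", pipeline);
        System.out.println(doc.metadata.get("language") + " " + doc.tokens);
    }
}
```

Nothing is written to disk between the two steps, so there is no
shared file to lock or re-read, and each document carries its own
settings with it.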

-----Original Message-----
From: Andrew Shirk [mailto:shirk@ncsa.uiuc.edu] 
Sent: Wednesday, August 29, 2007 12:06 PM
To: uima-user@incubator.apache.org
Subject: read/write resource sharing

What is the best practice for sharing read/write resources amongst 
analysis engines in an aggregate? For example, say you have an 
annotator early in a flow that reads a configuration file off disk in 
order to determine its behavior. Then, the next annotator does 
something, and needs to write changes to the configuration file so 
that another annotator downstream, whose behavior is also determined 
by the contents of the configuration file, can read in the resource 
that contains the changes.

Does this make sense?

Any help or ideas would be appreciated. I can think of some ugly 
hacks, but it would be nice to know if I'm missing some portion of 
the API that supports this type of scenario.

Thanks, Andrew
