uima-user mailing list archives

From Richard Eckart de Castilho <eck...@ukp.informatik.tu-darmstadt.de>
Subject Re: Obtaining the AnalysisEngine object from a primitive analysis component
Date Wed, 20 Feb 2013 22:11:15 GMT
Hi Shahim,

UIMA doesn't support the kind of scenario you describe. In the UIMA
world, components must exist independently of each other. Part of that
independence is that there is no way for a component to find out
what other components are in the pipeline. For example, UIMA supports
pipelines where each component runs on a different server and they
communicate over the network. A central control server may know about
the full pipeline, but the components running on the processing servers
do not - they only know about themselves. Even the central server may
not know that a processing server actually runs an aggregate analysis
engine rather than a primitive engine, so what the central server takes
to be one component is actually several.

So far, I have seen two approaches to a scenario such as you describe:

1) Store metadata in the CAS: every component that processes the CAS
adds some special metadata annotation(s) to the CAS. The last component
can read all of these and record the information in a DB/RDF/whatever.

2) Execution wrapper: instead of passing your analysis description (AAE,
CPE or whatever) directly to UIMA for execution, you create a thin
wrapper around UIMA to which you pass the description. The wrapper
can access the description and record what you need before or after
passing on the description to UIMA for execution.
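
To make approach 1) a bit more concrete, here is a minimal sketch in plain
Java. It deliberately does not use the UIMA API: `Cas`, `Component`,
`selfDescribing` and `Recorder` are hypothetical stand-ins invented for
this illustration. In a real pipeline the provenance entries would be
annotations of a type declared in your type system.

```java
import java.util.ArrayList;
import java.util.List;

class CasMetadataSketch {

    /** Hypothetical stand-in for a CAS; it only carries provenance entries. */
    static final class Cas {
        final List<String> provenance = new ArrayList<>();
    }

    /** Hypothetical stand-in for an analysis component. */
    interface Component {
        void process(Cas cas);
    }

    /** Builds a component that leaves a trace of itself in the CAS. */
    static Component selfDescribing(String name, String version) {
        return cas -> {
            // The component's actual analysis work would happen here.
            cas.provenance.add(name + ":" + version);
        };
    }

    /** Last component: collects whatever the earlier components recorded. */
    static final class Recorder implements Component {
        final List<String> seen = new ArrayList<>();

        @Override
        public void process(Cas cas) {
            // A real recorder would write these entries to a DB or as RDF.
            seen.addAll(cas.provenance);
        }
    }
}
```

Note that the recorder only sees components that cooperate by writing an
entry; any component that does not is invisible to it.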

Approach 2) is preferable, because it doesn't require the AEs
to support any special metadata annotations.
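
A rough sketch of such a wrapper, again in plain Java and without the
UIMA API: `PipelineDescription` and `RecordingRunner` are hypothetical
names made up for this example, and the `executor` callback stands in
for whatever call actually hands the description to UIMA.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class RecordingRunner {

    /** Hypothetical stand-in for an analysis description (AAE, CPE, ...). */
    static final class PipelineDescription {
        final String name;
        final List<String> componentNames;

        PipelineDescription(String name, List<String> componentNames) {
            this.name = name;
            this.componentNames = componentNames;
        }
    }

    private final List<String> log = new ArrayList<>();
    private final Consumer<PipelineDescription> executor;

    /** The executor is whatever actually runs the pipeline (assumed). */
    RecordingRunner(Consumer<PipelineDescription> executor) {
        this.executor = executor;
    }

    void run(PipelineDescription description) {
        // Record what we need from the description before execution.
        log.add("pipeline=" + description.name);
        for (String component : description.componentNames) {
            log.add("component=" + component);
        }
        // Pass the description on unchanged for execution.
        executor.accept(description);
        // Record anything that is only known after execution.
        log.add("finished=" + description.name);
    }

    /** Returns a copy of everything recorded so far. */
    List<String> recorded() {
        return new ArrayList<>(log);
    }
}
```

The components themselves stay untouched: all recording happens in the
wrapper, which is the only place that sees the full description.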

I recommend against continuing on the path you are currently on. It
is extremely unlikely that the UIMA framework core will be
changed to support access to the kind of information that you need.
You can hack your way to the information you require using reflection,
but that will create extremely fragile code. When internal details of
the UIMA framework change in any future release, there is a good
chance your code will break. Nobody will try to make sure non-public
API or fields remain stable across releases. Your idea will also not
work in a distributed environment, as outlined at the beginning of this
message.

-- Richard

On 20.02.2013 at 22:48, Shahim Essaid <shahim@essaid.com> wrote:

> Hi Richard,
> Thank you for your answer. I was trying to avoid discussing the
> details of the use case to avoid a lengthy email like this one :-) I
> am still filling in some of the details of my use case but here is the
> general idea.
> I would like to add a primitive analysis engine to the end of any
> pipeline and have this engine record all the information about the
> pipeline and the CAS without additional configuration or application
> modification. So, this engine needs to get all the information from
> the specifier for the pipeline. This information will be recorded in
> RDF as information about what I am calling the "implementation of an
> analysis agent". An agent could be a person, an algorithm, a UIMA
> pipeline, etc. and the engine I am adding to the end of a UIMA
> pipeline will capture this information in a UIMA specific way and then
> write it as RDF that is based on a general ontology I am developing.
> Basically, I need to reach into the pipeline to collect as much
> metadata as possible. I was able to find the URL of the primitive
> specifiers in the configuration manager but I couldn't find the URL
> for the top level engine. Also, the URL might not be available if the
> pipeline is put together dynamically. I think what I need is the
> metadata object for the top resource but I can't see a way to get hold
> of that object. It is private but I can get it through reflection as
> long as I can get hold of the top resource for a pipeline instance.
> The long-term goal that I am working on is the development of a generic
> OWL/RDF model of content, annotations, interpretations, etc. that can
> aggregate output from different tools with the main one being UIMA. I
> am an informatics and formal semantics researcher and many researchers
> and users in our domain need tools and solutions for generating and
> aggregating such data in a common way and I am attempting to do this
> in RDF. There are other pieces to my work but they are not directly
> related to my UIMA questions.
> Hope this helps clarify my earlier questions.
> Best,
> Shahim

Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab (UKP-TUD) 
FB 20 Computer Science Department      
Technische Universität Darmstadt
Hochschulstr. 10, D-64289 Darmstadt, Germany 
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
