uima-user mailing list archives

From Marshall Schor <...@schor.com>
Subject Re: ConceptMapper: tokenizer doesn't get PEAR classpath
Date Fri, 02 Mar 2012 22:05:51 GMT

On 3/2/2012 11:03 AM, Jens Grivolla wrote:
> Hi,
> when using the ConceptMapper from addons as a PEAR we are having classpath 
> problems.  The ConceptMapper launches a tokenizer AE using its XML descriptor, 
> but at that point the classpath set from the PEAR does not get used.

That is correct.

> This means that it is impossible to point to a tokenizer packaged together 
> with the CM based AE, or it is at least necessary to add the tokenizer classes 
> (or jar) as well as all of its dependencies to the global classpath.

There are some other possibilities.  One is to package the tokenizer as a PEAR 
and install it as well, and then change the parameter that specifies the UIMA 
pipeline to run for tokenization so that it points to a pearSpecifier.  In this 
case, the tokenizer would run with its own classpath.

> It all seems to come down to (AnnotatorAdapter.java:97):
> ae = UIMAFramework.produceAnalysisEngine(aeSpecifier);
> But I don't see why the classpath that is used by the ConceptMapper would not 
> apply here. It must have to do with how the classpath is adjusted "locally" 
> for PEARs instead of being global to the whole JVM, but I haven't been able to 
> figure it out yet.

I think this is because the framework sets up a special class loader for the 
classes loaded by the PEAR's implementation class.  However, in this case that 
implementation class calls the UIMA Framework to produce an analysis engine - so 
the loader used for that is the one the UIMA Framework has.

I'm pretty sure it is possible to change the design of how Concept Mapper works, 
to have the tokenizer inherit the classpath (actually, the Resource Manager) of 
the Concept Mapper.  If done, this could have other implications.  For example, 
it would be possible to have an external resource specification that was shared 
between the tokenizer and the Concept Mapper (and, indeed, if the Concept Mapper 
were contained in some outer UIMA Aggregate, with any annotator in that 
Aggregate).

This fix would entail capturing the resource manager instance that the concept 
mapper is running with, and passing that in to the framework call to produce the 
tokenizer resource.
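
To make the loader issue and the proposed fix concrete, here is a rough sketch 
that models them with plain JDK class loaders.  It is not UIMA code: LoaderDemo, 
PearTokenizer, FRAMEWORK_LOADER, produce and canProduce are hypothetical names 
made up for illustration, and the two-argument produce() only stands in for the 
idea of handing the caller's Resource Manager to the framework.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Plain-JDK model of the problem; none of these names come from UIMA.
class LoaderDemo {
    // Stand-in for the framework's own class loader. Its parent is null,
    // so it sees JDK classes but nothing that ships with the "PEAR".
    static final ClassLoader FRAMEWORK_LOADER =
            new URLClassLoader(new URL[0], null);

    // Models the current call: the class is resolved by the framework's loader.
    static Object produce(String className) throws ReflectiveOperationException {
        return produce(className, FRAMEWORK_LOADER);
    }

    // Models the proposed change: the caller passes in the loader it runs with.
    static Object produce(String className, ClassLoader loader)
            throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className, true, loader);
        return cls.getDeclaredConstructor().newInstance();
    }

    // Reports whether the given loader is able to produce the class.
    static boolean canProduce(String className, ClassLoader loader) {
        try {
            produce(className, loader);
            return true;
        } catch (ReflectiveOperationException e) {
            return false; // e.g. ClassNotFoundException from the wrong loader
        }
    }

    public static void main(String[] args) {
        String tokenizer = PearTokenizer.class.getName();
        // Stand-in for the PEAR class loader: the loader that defined the tokenizer.
        ClassLoader pearLoader = PearTokenizer.class.getClassLoader();

        System.out.println("framework loader can produce tokenizer: "
                + canProduce(tokenizer, FRAMEWORK_LOADER));
        System.out.println("caller's loader can produce tokenizer:  "
                + canProduce(tokenizer, pearLoader));
    }
}

// Hypothetical tokenizer that is only visible to the "PEAR" loader.
class PearTokenizer {
    String[] tokenize(String text) {
        return text.trim().split("\\s+");
    }
}
```

In UIMA itself, the analogous step would presumably be to pass the Concept 
Mapper's Resource Manager to the overload of 
UIMAFramework.produceAnalysisEngine that accepts one, rather than calling the 
one-argument form.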

Do people think this would be a good change, or does it make things too 
complicated?

> Any ideas?
> Thanks,
> Jens
