uima-user mailing list archives

From Rogan Creswick <cresw...@gmail.com>
Subject Re: Tips for a beginner
Date Fri, 04 Dec 2009 06:41:47 GMT
On Thu, Dec 3, 2009 at 8:19 PM, Marshall Schor <msa@schor.com> wrote:
> PEARs provide both a packaging and an "isolation" capability.
> Packaging:  You get to zip up together potentially many components in an
> aggregate, including the JARs, and the classpaths, so that others can
> use the component.
> Isolation: If your Pear uses a particular version of some Jars, included
> in your PEAR package, UIMA runs these with an isolated class loader so
> that your versions of the Jars are kept isolated from other versions of
> the same Jars that might also be present in the rest of the system.
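
To make the isolation point concrete, here is a minimal sketch of the general idea using only java.net.URLClassLoader. It is not UIMA's actual PEAR class loader: the class name IsolationSketch and both helper methods are made up for illustration, and the sketch gives the loader no parent at all, which is stricter than whatever delegation UIMA really uses.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class IsolationSketch {

    /** Creates a loader with no application parent (hypothetical helper). */
    static ClassLoader newIsolatedLoader() {
        // A null parent means only bootstrap (JDK core) classes are visible,
        // plus whatever URLs are listed here. A PEAR-like setup would list
        // the package's own Jars instead of an empty array.
        return new URLClassLoader(new URL[0], null);
    }

    /** Returns true if the given loader can resolve the named class. */
    static boolean canLoad(ClassLoader loader, String className) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String self = IsolationSketch.class.getName();
        ClassLoader app = IsolationSketch.class.getClassLoader();
        ClassLoader isolated = newIsolatedLoader();

        // The application loader sees application classes.
        System.out.println("app sees " + self + ": " + canLoad(app, self));
        // The isolated loader does not, so versions cannot clash.
        System.out.println("isolated sees " + self + ": " + canLoad(isolated, self));
        // JDK core classes stay visible everywhere.
        System.out.println("isolated sees String: "
                + canLoad(isolated, "java.lang.String"));
    }
}
```

Two such loaders pointed at different versions of the same Jar would each resolve their own copy, which is the effect described above.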

Does anyone know how the Annotator Parameters / Resources play into
PEAR packaging?

I have a need for "configurable" PEARs: For example, machine learning
algorithms that are self-contained in the PEAR, produce an agreed upon
annotation, and can be parameterized with different knowledge bases.

The goal is to reuse one PEAR to handle multiple data sets, or perform
a similar test task with assorted sets of training data.  Since the
implementation of the algorithm is separable from the training /
knowledge base, it just makes sense to maintain that separation at the
UIMA level too.

I've managed to hack around the UIMA API for parameters (which doesn't
seem to work with PEAR packaging), but the result is fragile and
difficult to debug.  In particular, parameters overridden in aggregate
annotators don't seem to override the component's parameter defaults
at runtime; as far as I can tell, the override is applied only when
the PEAR is packaged.
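
For clarity, this is the runtime behavior I would expect from an override, written as plain Java with no UIMA calls. Everything here is hypothetical: the class ParameterOverrideSketch, the resolve method, the "knowledgeBase" parameter name, and the file paths are invented for illustration and are not part of the UIMA API.

```java
import java.util.HashMap;
import java.util.Map;

public class ParameterOverrideSketch {

    /**
     * Computes the effective settings: start from the component's
     * defaults, then let any aggregate-level override replace them.
     */
    static Map<String, String> resolve(Map<String, String> componentDefaults,
                                       Map<String, String> aggregateOverrides) {
        Map<String, String> effective = new HashMap<>(componentDefaults);
        effective.putAll(aggregateOverrides); // override wins when present
        return effective;
    }

    public static void main(String[] args) {
        // Hypothetical defaults shipped inside the packaged component.
        Map<String, String> defaults = Map.of(
                "knowledgeBase", "models/default.bin",
                "language", "pt-BR");
        // Hypothetical override supplied by the enclosing aggregate.
        Map<String, String> overrides = Map.of(
                "knowledgeBase", "models/news-corpus.bin");

        Map<String, String> effective = resolve(defaults, overrides);
        System.out.println("knowledgeBase = " + effective.get("knowledgeBase"));
        System.out.println("language = " + effective.get("language"));
    }
}
```

With this resolution step done at runtime, one packaged algorithm could be pointed at a different knowledge base per aggregate without being repackaged.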


> This isolation may not be what you want, though. For more typical
> component assembly, you put components together without Pear isolation,
> using "plain" components and building normal aggregations of these.  In
> this approach, you have to ensure that any Jars used by the components
> are at the same level.
> When you're all done, you can deliver the resulting aggregate as a Pear,
> if that is reasonable in your scenario, to others.
> HTH.   -Marshall
> William Colen wrote:
>> Hello,
>> We are moving our Brazilian Portuguese annotators (sentence detector,
>> tokenizer, tagger, parser) to UIMA. We have different implementations of
>> some annotators: some were created using OpenNLP, others were written from
>> scratch.
>> We would like to have .PEARs for each annotator, so in the applications we
>> would change the annotators easily. Also we don't want to have duplicated
>> resources (mostly dictionaries, and the UIMA typesystem descriptor), so we
>> need a way to share.
>> The first thing we did was to create the UIMA wrappers and descriptors. They
>> are ready and we were able to create a UIMA application that uses the
>> annotators (but we had to put all the code and descriptors in only one
>> Eclipse project. No .pear yet).
>> Now we are splitting the projects into smaller ones. For instance, we create
>> only one TypeSystem.xml and would like every project to use it. So I created
>> a simple .pear with the TypeSystem.xml.
>> After that I created another pear for the sentence detector, but I couldn't
>> import the TypeSystem.xml of the first pear. The only way to do that was
>> using the relative path or if I put the TypeSystem.xml inside the JAR file
>> of the first pear, and import it using classpath.
>> Do you know a better way to do that?
>> Later I'll have to do the same with the dictionaries. I'm not sure if the
>> best approach is to create a .pear only for it. Will it work?
>> Thanks
>> William
