uima-user mailing list archives

From Richard Eckart de Castilho <...@apache.org>
Subject Re: uimafit - String[] parameter in Resource_ImplBase
Date Fri, 13 Jun 2014 09:25:59 GMT
Hi Armin,

I'll answer in-line below.

On 04.06.2014, at 10:04, Armin.Wegner@bka.bund.de wrote:

> Hello Richard!
> I would like to have a writer that writes all mentions of a given type. The type is given
> by name as an AE parameter. The way the mentions are formatted should be interchangeable. So
> the formatter varies and should be encapsulated as an AE resource (or maybe not?).
> public class AnnotationWriter extends CasConsumer_ImplBase {
> 	public static final String PARA_TYPE_NAME = "typeName";
> 	/**
> 	 * The name of the type whose mentions are to be written.
> 	 */
> 	@ConfigurationParameter(name = PARA_TYPE_NAME, mandatory = true)
> 	private String mTypeName;
> 	/**
> 	 * The type whose mentions are to be written.
> 	 */
> 	private Type mType;
> 	public static final String RES_ANNOTATION_FORMATTER = "annotationFormatter";
> 	@ExternalResource(key = RES_ANNOTATION_FORMATTER, mandatory = true)
> 	private AnnotationFormatter mFormatter;
> 	@Override
> 	public void typeSystemInit(final TypeSystem typeSystem) throws AnalysisEngineProcessException {
> 		super.typeSystemInit(typeSystem);
> 		mType = typeSystem.getType(mTypeName);
> 	}
> 	@Override
> 	public final void process(final CAS cas) throws AnalysisEngineProcessException {
> 		/*
> 		 * Write all annotations of the given type.
> 		 */
> 		try (final Writer writer = /* build a writer */) {
> 			for (final AnnotationFS annotation : CasUtil.select(cas, mType)) {
> 				writer.append(mFormatter.format(annotation));
> 			}
> 		} catch (IOException cause) {
> 			throw new AnalysisEngineProcessException(cause);
> 		}
> 	}
> }
> This is the interface for all formatters.
> public interface AnnotationFormatter {
> 	String format(final AnnotationFS annotation);
> }
> This is a concrete implementation of a formatter. The problem is that this is not an
> external resource. There is no file, no dictionary, no database connection, or whatever.
> It is just a simple object. Most likely, this is not how a UIMA resource should be used.

You are probably right that the original authors of the shared resources mechanism didn't have
the use-case of using shared resources as a generic strategy pattern in mind. I personally
think that this is a perfectly valid use-case.

> public class TsvAnnotationFormatter extends Resource_ImplBase implements AnnotationFormatter {
> 	public static final String PARA_FEATURE_NAMES = "featureNames";
> 	/**
> 	 * This would be nice but does not work.
> 	 */
> 	@ConfigurationParameter(name = PARA_FEATURE_NAMES, mandatory = true)
> 	private String[] mFeatureNames;
> 	@Override
> 	public final String format(final AnnotationFS annotation) {
> 		// Pretty print the given features' values.
> 	}
> } 
> As you said, String[] works fine with SharedResourceObject. But SharedResourceObject
> demands a real resource to be loaded which I don't have.

I see that differently. The SharedResourceObject *allows* you to load a real resource,
but you can choose to pass in a dummy value (maybe even null). But I sympathize. For some
time, I have been toying with the idea of changing the uimaFIT Resource_ImplBase to implement
SharedResourceObject (for the better parameter support) and turning load(DataResource
aData) into what is basically a no-op, so that subclasses do not have to implement it.

> There is a simple solution to this: Omit the pseudo resource and make featureNames a
> parameter of AnnotationWriter. I can still use the formatter interface but only internally
> to the writer. But I have to code a new writer for each annotation formatter. That works fine
> but is not the kind of modularization I would like to have.

There is another way of implementing strategies that are not aware of UIMA and that do not
implement any of the UIMA interfaces. uimaFIT provides the concept of a ResourceLocator. Such
a locator is basically a factory that knows how to instantiate the kind of non-UIMA objects
that you want to use in your analysis engine. An example is given in Section 7.2 of the uimaFIT
manual [1].
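
To make the factory idea concrete, here is a minimal sketch in plain Java with no UIMA dependency. All names in it (PlainFormatter, TsvPlainFormatter, PlainFormatterLocator) are hypothetical stand-ins invented for illustration; a real locator would additionally implement the uimaFIT locator interface described in the manual and would receive its settings as configuration parameters.

```java
import java.util.List;

// Hypothetical stand-in for the formatter strategy from this thread.
// It takes plain strings so the sketch has no UIMA dependency.
interface PlainFormatter {
    String format(List<String> values);
}

// A strategy object that knows nothing about UIMA.
class TsvPlainFormatter implements PlainFormatter {
    @Override
    public String format(List<String> values) {
        // Join the values with tab characters.
        return String.join("\t", values);
    }
}

// Hypothetical locator: a small factory whose only job is to hand out
// the non-UIMA object that the analysis engine wants to use.
class PlainFormatterLocator {
    private final String formatterName;

    PlainFormatterLocator(String formatterName) {
        this.formatterName = formatterName;
    }

    // Returns the located object; the caller casts it to the expected type.
    Object getResource() {
        if ("tsv".equals(formatterName)) {
            return new TsvPlainFormatter();
        }
        throw new IllegalArgumentException("Unknown formatter: " + formatterName);
    }
}
```

The point of the indirection is that only the locator would need to be UIMA-aware, while the formatter itself stays an ordinary object.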

I have been playing with the idea of implementing a generic JavaBeanInjector based on the
ResourceLocator support, e.g.

AnalysisEngineDescription desc = createEngineDescription(MyAnalysisEngine2.class,
    MyAnalysisEngine2.RES_BEAN, createExternalResourceDescription(JavaBeanInjector.class,
      JavaBeanInjector.PARAM_CLASS, MyJavaBeanClass.class,
      "field1", "value1"));

The AE would look like this:

  static final String RES_BEAN = "bean";
  @ExternalResource(key = RES_BEAN)
  MyJavaBeanClass bean; // Could also use an interface here

It should be easy to implement such a JavaBeanInjector if you want to try this approach. Basically
the JavaBeanInjector would use the Java reflection API to instantiate the class and fill in
the fields. Using helpers from Spring or Apache Commons, it should be possible to do this
with just a few lines of code.
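
As a rough illustration of the reflection part only, the following plain-Java sketch instantiates a class and fills in its fields from name/value pairs. BeanInstantiator and ExampleBean are hypothetical names, not part of uimaFIT, and the sketch assumes a no-argument constructor, fields declared directly on the class (not inherited), and values that already have the right type.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;

// Hypothetical helper, not part of uimaFIT: creates an instance through the
// no-argument constructor and sets fields from alternating name/value pairs.
final class BeanInstantiator {
    private BeanInstantiator() {
        // Static helper; no instances.
    }

    static <T> T create(Class<T> beanClass, Object... fieldValuePairs)
            throws ReflectiveOperationException {
        if (fieldValuePairs.length % 2 != 0) {
            throw new IllegalArgumentException("Expected field name/value pairs");
        }
        Constructor<T> constructor = beanClass.getDeclaredConstructor();
        constructor.setAccessible(true);
        T bean = constructor.newInstance();
        for (int i = 0; i < fieldValuePairs.length; i += 2) {
            // Only fields declared on beanClass itself are found here.
            Field field = beanClass.getDeclaredField((String) fieldValuePairs[i]);
            field.setAccessible(true);
            field.set(bean, fieldValuePairs[i + 1]);
        }
        return bean;
    }
}

// Example bean standing in for MyJavaBeanClass.
class ExampleBean {
    private String field1;
    private String[] featureNames;

    String getField1() {
        return field1;
    }

    String[] getFeatureNames() {
        return featureNames;
    }
}
```

A call such as BeanInstantiator.create(ExampleBean.class, "field1", "value1", "featureNames", new String[] {"begin", "end"}) would then yield a populated bean. An actual injector would still need to convert the configured parameter values to the field types, which is where the Spring or Commons helpers come in.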

What do you think?


-- Richard

[1] http://uima.apache.org/d/uimafit-current/tools.uimafit.book.html#d5e519
