uima-user mailing list archives

From holmberg2066@comcast.net (g...@holmberg.name)
Subject Re: Spring factoryBean for producing AE: processors, consumer, readers and PEAR
Date Thu, 10 Jul 2008 04:34:13 GMT
Roberto--


I, for one, am interested.  I would like to use Spring with UIMA.

Something I haven't explored, and wonder how it would compare to the technique below, is to
combine IBM's support for OSGi in UIMA (http://www.alphaworks.ibm.com/tech/dmeuima) with Spring's
support for OSGi (http://springframework.org/node/704) and an OSGi implementation (http://felix.apache.org).

Or is that just making things unnecessarily complicated?  What would be the benefits compared
to using PEAR files?  Would I get a separate classloader for each annotator, so I won't have
class version collisions (such as with XML parsers)?


Greg Holmberg

 -------------- Original message ----------------------
From: "Roberto Franchini" <ro.franchini@gmail.com>
> Hi,
> I wrote some components useful for integrating UIMA components into a
> Spring application.
> These components are Spring FactoryBeans that can produce
> CasProcessors/Consumers, CollectionReaders and type systems.
> Components can be built "totally programmatically", from a descriptor
> or from a PEAR.
> I want to release these components to the community, if that sounds good.
> This work builds on code posted by Steven Bethard on this mailing list.
> Thanks a lot, Steven!
> 
> Here are some usage examples:
> 
> <!-- collection reader -->
> 	<bean name="cr" class="it.celi.uima.bean.CollectionReaderFactoryBean" parent="baseAnnotator">
> 		<property name="componentClass" value="it.celi.components.collection.RecursiveFileSytemCollectionReader" />
> 		<property name="configurationParameters">
> 			<map>
> 				<entry key="application" value="language" />
> 				<entry key="language" value="it" />
> 			</map>
> 		</property>
> 	</bean>
> 
> where baseAnnotator is:
> 	<bean name="baseAnnotator" class="it.celi.uima.bean.AbstractUIMAComponentsFactoryBean" abstract="true">
> 		<property name="typeSystem" ref="typeSystem" />
> 	</bean>
> 
> 	<bean name="typeSystem" class="it.celi.uima.bean.TypeSytemFactoryBean">
> 		<property name="typeSytemPath" value="file:../dd4-typeSystem/src/main/resources/CeliTypeSystem.xml" />
> 	</bean>
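
For readers unfamiliar with the pattern behind these beans, here is a minimal sketch of what a "componentClass plus configurationParameters" factory might do. It mirrors the three methods of Spring's FactoryBean contract (getObject, getObjectType, isSingleton) without depending on Spring or UIMA. ConfigurableComponent, EchoReader and ComponentFactorySketch are hypothetical names invented for illustration; they are not Roberto's classes, and his actual implementation is not shown in this message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a UIMA component; not part of UIMA or the posted code.
interface ConfigurableComponent {
    void configure(Map<String, Object> parameters);
}

// Example component used only to exercise the factory.
class EchoReader implements ConfigurableComponent {
    Map<String, Object> parameters = new LinkedHashMap<>();

    @Override
    public void configure(Map<String, Object> parameters) {
        this.parameters = new LinkedHashMap<>(parameters);
    }
}

// Same shape as Spring's FactoryBean (getObject/getObjectType/isSingleton),
// written without a Spring dependency so the sketch stays self-contained.
class ComponentFactorySketch<T extends ConfigurableComponent> {
    private final Class<T> componentClass;
    private final Map<String, Object> configurationParameters;
    private T instance; // cached because isSingleton() returns true

    ComponentFactorySketch(Class<T> componentClass, Map<String, ?> configurationParameters) {
        this.componentClass = componentClass;
        this.configurationParameters = new LinkedHashMap<>(configurationParameters);
    }

    public synchronized T getObject() {
        if (instance == null) {
            try {
                // Instantiate by reflection, then apply the injected parameters.
                T created = componentClass.getDeclaredConstructor().newInstance();
                created.configure(configurationParameters);
                instance = created;
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot create " + componentClass.getName(), e);
            }
        }
        return instance;
    }

    public Class<T> getObjectType() {
        return componentClass;
    }

    public boolean isSingleton() {
        return true;
    }
}
```

In a real Spring context the container would call getObject() itself, so other beans that reference "cr" receive the produced component rather than the factory.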
> 	
> 
> Processor/consumers:
> 
> 	<bean name="sentenceAnnotator" class="it.celi.uima.bean.CasProcessorFactoryBean" parent="baseAnnotator">
> 		<property name="componentClass" value="it.celi.annotators.language.SentenceAnnotator" />
> 		<property name="configurationParameters">
> 			<map>
> 				<entry key="abbreviationsFiles" value="abbreviations_*.txt" />
> 				<entry key="additionalSeparatorsFiles" value="sentenceSeparators_*.txt" />
> 			</map>
> 		</property>
> 	</bean>
> 
> 	<bean name="xslSerializerCasConsumer" class="it.celi.uima.bean.CasConsumerFactoryBean" parent="baseAnnotator">
> 		<property name="componentClass" value="it.celi.components.consumer.XslSerializerCasConsumer" />
> 		<property name="configurationParameters">
> 			<map>
> 				<entry key="fileExtension" value=".xml" />
> 			</map>
> 		</property>
> 	</bean>
> 
> 
> PEAR files (overriding configuration parameters is not allowed!):
> 
> 	<bean name="japeAnnotator" class="it.celi.uima.bean.CasProcessorFactoryBean">
> 		<property name="descriptorPath" value="file:./pears/JapeAnnotator.pear" />
> 		<property name="redeployPear" value="true"/>
> 
> 		<property name="configurationParameters">
> 			<map>
> 			</map>
> 		</property>
> 	</bean>
> 
> From a descriptor, with parameter overrides:
> 
> 	<bean name="regExpTokenizer" class="it.celi.uima.bean.CasProcessorFactoryBean">
> 		<property name="descriptorPath" value="file:./desc/RegExpTokenizer.xml" />
> 		<property name="configurationParameters">
> 			<map>
> 				<entry key="commandsFileName" value="commands_tokenizer_*.xml" />
> 			</map>
> 		</property>
> 	</bean>
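
The difference between the two descriptor-based cases (overrides allowed for an XML descriptor, rejected for a PEAR) could be expressed as a small merge policy. The following is only a sketch under that assumption: ParameterOverrideSketch, Source and resolve are hypothetical names, the default values in the usage note are invented, and the message does not say how the real factory beans enforce the PEAR restriction.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of UIMA or of the posted factory beans.
class ParameterOverrideSketch {

    // Where the component definition came from.
    enum Source { DESCRIPTOR, PEAR }

    // Returns the descriptor defaults with the Spring-supplied overrides applied.
    // Assumed policy: a PEAR keeps its packaged configuration, so any override is an error.
    static Map<String, Object> resolve(Source source,
                                       Map<String, ?> defaults,
                                       Map<String, ?> overrides) {
        if (source == Source.PEAR && !overrides.isEmpty()) {
            throw new IllegalArgumentException(
                    "configuration parameters cannot be overridden for a PEAR: " + overrides.keySet());
        }
        Map<String, Object> merged = new LinkedHashMap<>(defaults);
        merged.putAll(overrides); // an override wins over the descriptor default
        return merged;
    }
}
```

For example, resolving a descriptor whose (invented) default for commandsFileName is "default.xml" with the override from the bean above yields "commands_tokenizer_*.xml", while the same call with Source.PEAR throws.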
> 
> 
> A simple use case could be:
> 
> Configuration:
> 
> 	<bean name="cpm" class="org.apache.uima.UIMAFramework" factory-method="newCollectionProcessingManager" />
> 
> 	<bean name="uimaCPM" class="it.celi.uima.engine.CpmUIMAEngine">
> 		<property name="cpm" ref="cpm" />
> 		<property name="listeners">
> 		</property>
> 		<property name="readers">
> 
> 			<list>
> 				<ref bean="cr" />
> 			</list>
> 		</property>
> 		<property name="processors">
> 			<list>
> 				<ref bean="sentenceAnnotator" />
> 				<ref bean="regExpTokenizer" />
> 				<ref bean="japeAnnotator" />
> 
> 			</list>
> 		</property>
> 		<property name="consumers">
> 			<list>
> 				<ref bean="xslSerializerCasConsumer" />
> 			</list>
> 		</property>
> 	</bean>
> 
> 
> The last bean is a CPM wrapper that internally does the following.
> 
> Methods that add the consumers and processors to the CPM (the lists are
> injected by the configuration above):
> 
> 	// Registers every injected consumer; a failure is logged and the loop continues.
> 	private void addAllConsumersToCpm() {
> 		for (CasConsumer casConsumer : consumers) {
> 			String name = casConsumer.getProcessingResourceMetaData().getName();
> 			try {
> 				logger.info("adding consumer to pipeline::" + name);
> 				cpm.addCasConsumer(casConsumer);
> 			} catch (ResourceConfigurationException e) {
> 				logger.error("unable to add consumer :: " + name, e);
> 			}
> 		}
> 	}
> 
> 	// Registers every injected processor, in list order.
> 	private void addAllProcessorToCpm() {
> 		for (CasProcessor casProcessor : processors) {
> 			String name = casProcessor.getProcessingResourceMetaData().getName();
> 			try {
> 				logger.info("adding processor to pipeline::" + name);
> 				cpm.addCasProcessor(casProcessor);
> 			} catch (ResourceConfigurationException e) {
> 				logger.error("unable to add processor :: " + name, e);
> 			}
> 		}
> 	}
> 
> and then a method can do:
> 
> 			cpm.setCollectionReader(reader);
> 			cpm.process();
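
One design question the wrapper raises is what should happen when a component cannot be added: if the failure is only logged, processing may start with an incomplete pipeline. As a sketch of one possible alternative (not what the posted wrapper does), the variant below records failures and refuses to run unless every component was registered. Pipeline, PipelineComponent and PipelineWrapperSketch are hypothetical stand-ins, used so the example runs without UIMA on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a CasProcessor/CasConsumer; only the name is needed here.
interface PipelineComponent {
    String name();
}

// Hypothetical stand-in for the CPM: components are added, then the pipeline is run.
interface Pipeline {
    void add(PipelineComponent component) throws Exception;
    void run();
}

class PipelineWrapperSketch {
    private final Pipeline pipeline;
    private final List<PipelineComponent> components; // injected, as in the Spring config
    private final List<String> failed = new ArrayList<>();

    PipelineWrapperSketch(Pipeline pipeline, List<PipelineComponent> components) {
        this.pipeline = pipeline;
        this.components = components;
    }

    // Tries to add every component and remembers the ones that were rejected.
    private void addAll() {
        for (PipelineComponent component : components) {
            try {
                pipeline.add(component);
            } catch (Exception e) {
                failed.add(component.name());
            }
        }
    }

    // Runs the pipeline only if all components were added; returns whether it ran.
    boolean runIfComplete() {
        addAll();
        if (!failed.isEmpty()) {
            return false;
        }
        pipeline.run();
        return true;
    }

    List<String> failedComponents() {
        return failed;
    }
}
```

Whether to fail fast like this or to continue with a partial pipeline depends on the application; the sketch only makes the choice explicit.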
> 
> 
> Some advantages:
> -only one simple file to configure a CPM
> -easy to inject components
> -easy to embed a CPM/AE inside existing applications
> -can use SpringIDE inside Eclipse
> -....whatever?
> Disadvantages:
> -if you don't use Spring, there's another framework to learn
> -you can't use Eclipse's UIMA plugins to edit/manage descriptors
> -Aggregates are not supported programmatically (via descriptors there's
> no problem)
> -....whatever?
> 
> Is it interesting? Let me know.
> 
> Roberto
> -- 
> Roberto Franchini
> http://www.celi.it
> http://www.blogmeter.it
> http://www.memesphere.it
> Tel +39-011-6600814
> jabber:ro.franchini@gmail.com skype:ro.franchini

