Actually, I just tried it with the Annotation Printer instead of Solrcas and
got the same exception. I will back up and troubleshoot this by
inspecting the output of the MetaMapApiAE.
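
A minimal sketch of such an offset check, not the MetaMap or UIMA API: `Span` is a hypothetical stand-in for an annotation's type name and begin/end offsets, `SpanCheck` and its method names are made up for illustration, and the 191-character document is a placeholder because the real text is not in this thread. It flags any span whose offsets could not be passed safely to `String.substring`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SpanCheck {

    /** Hypothetical stand-in for an annotation: a type name plus begin/end offsets. */
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return type + " [" + begin + ", " + end + ")";
        }
    }

    /** True when the offsets can safely be passed to String.substring(begin, end). */
    static boolean isValid(Span span, int docLength) {
        return span.begin >= 0 && span.begin <= span.end && span.end <= docLength;
    }

    /** Collects every span whose offsets fall outside the document text. */
    static List<Span> findInvalid(List<Span> spans, int docLength) {
        List<Span> invalid = new ArrayList<Span>();
        for (Span span : spans) {
            if (!isValid(span, docLength)) {
                invalid.add(span);
            }
        }
        return invalid;
    }

    /** Returns the covered text, or null instead of throwing when the offsets are bad. */
    static String coveredTextOrNull(String doc, Span span) {
        return isValid(span, doc.length()) ? doc.substring(span.begin, span.end) : null;
    }

    public static void main(String[] args) {
        // Placeholder 191-character document; the real text is an assumption.
        char[] chars = new char[191];
        Arrays.fill(chars, 'x');
        String doc = new String(chars);

        List<Span> spans = Arrays.asList(
                new Span("Phrase", -1, 60),  // offsets from the debug output
                new Span("Phrase", 0, 7));   // offsets DocumentAnalyzer reported

        for (Span bad : findInvalid(spans, doc.length())) {
            System.out.println("Invalid offsets: " + bad);
        }
    }
}
```

In a real pipeline the same test would presumably read the offsets from each annotation's `getBegin()` and `getEnd()` right after the MetaMapApiAE runs, so a bad span is reported before any downstream component calls `getCoveredText()`.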
Dave
On Thu, Feb 3, 2011 at 12:06 PM, David Thibault
<david.r.thibault@gmail.com> wrote:
> Hello all,
>
> First off, I apologize for sending this to both the user and dev lists, but
> I'm not sure which list should get it. This is my first email to either
> list.
>
> I am working with UIMA and Solrcas and I'm getting this error:
> org.apache.uima.analysis_engine.AnalysisEngineProcessException
> at org.apache.uima.solrcas.SolrCASConsumer.process(SolrCASConsumer.java:138)
> at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
> at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:377)
> at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:295)
> at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
> at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:897)
> at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:577)
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
> at java.lang.String.substring(String.java:1931)
> at org.apache.uima.jcas.tcas.Annotation.getCoveredText(Annotation.java:119)
> at org.apache.uima.solrcas.SolrCASConsumer.process(SolrCASConsumer.java:126)
> ... 6 more
>
> I edited SolrCASConsumer with the following lines right before line 126:
> Annotation fsTemp = (Annotation) fs;
> System.out.println("Processing Annotation: " + fsTemp.toString());
>
> Now, right before it calls fs.getCoveredText(), it prints this:
> Processing Annotation: Phrase
> sofa: _InitialView
> begin: -1
> end: 60
> candidates: FSArray
> mappings: FSArray
>
> So it's clear why the string index is out of bounds: begin is -1.
> However, I'm not sure why my analysis engine is producing those values.
> I'm using MetaMapAEApi from the NIH's MetaMap project.
>
> This is the first phrase it processes in this document and the first
> time it prints that block of debug text. If I use the same AE in
> DocumentAnalyzer, it correctly shows the first document as starting at
> position 0 and ending at position 191, with the first phrase spanning
> positions 0 to 7.
>
> I'm trying to run this in the CPE GUI with the following CPEDescriptor.xml:
> <?xml version="1.0" encoding="UTF-8"?>
> <cpeDescription xmlns="http://uima.apache.org/resourceSpecifier">
> <collectionReader>
> <collectionIterator>
> <descriptor>
> <import location="../../../../../../../usr/local/apache-uima/examples/descriptors/collection_reader/FileSystemCollectionReader.xml"/>
> </descriptor>
> <configurationParameterSettings>
> <nameValuePair>
> <name>InputDirectory</name>
> <value>
> <string>/Users/davidt/Documents/workspace/BioSearch/resources/test_input</string>
> </value>
> </nameValuePair>
> </configurationParameterSettings>
> </collectionIterator>
> </collectionReader>
> <casProcessors casPoolSize="3" processingUnitThreadCount="1">
> <casProcessor deployment="integrated" name="MetaMapApiAE">
> <descriptor>
> <import location="../../../MetaMap UIMA Annotator/descriptors/MetaMapApiAE.xml"/>
> </descriptor>
> <deploymentParameters/>
> <errorHandling>
> <errorRateThreshold action="terminate" value="0/1000"/>
> <maxConsecutiveRestarts action="terminate" value="30"/>
> <timeout max="100000" default="-1"/>
> </errorHandling>
> <checkpoint batch="10000" time="1000ms"/>
> <configurationParameterSettings>
> <nameValuePair>
> <name>tempdir_path</name>
> <value>
> <string>/Users/davidt/tmp</string>
> </value>
> </nameValuePair>
> </configurationParameterSettings>
> </casProcessor>
> <casProcessor deployment="integrated" name="SolrcasAE.xml">
> <descriptor>
> <import location="../../../Apache_UIMA_Sandbox/Solrcas/desc/SolrcasAE.xml"/>
> </descriptor>
> <deploymentParameters/>
> <errorHandling>
> <errorRateThreshold action="terminate" value="0/1000"/>
> <maxConsecutiveRestarts action="terminate" value="30"/>
> <timeout max="100000" default="-1"/>
> </errorHandling>
> <checkpoint batch="10000" time="1000ms"/>
> </casProcessor>
> </casProcessors>
> <cpeConfig>
> <numToProcess>-1</numToProcess>
> <deployAs>immediate</deployAs>
> <checkpoint batch="0" time="300000ms"/>
> <timerImpl/>
> </cpeConfig>
> </cpeDescription>
>
> I'm at a loss as to where that -1 is coming from or how to debug it
> further. Any ideas would be greatly appreciated.
>
> Best,
> Dave
>
>