uima-user mailing list archives

From David Thibault <david.r.thiba...@gmail.com>
Subject Re: StringIndexOutOfBoundsException using Solrcas
Date Thu, 03 Feb 2011 18:08:51 GMT
Actually, I just tried it with the Annotation Printer instead of Solrcas and
it got the same exception.  I will back up and troubleshoot this by
inspecting the output of the MetaMapApiAE.

Dave


On Thu, Feb 3, 2011 at 12:06 PM, David Thibault
<david.r.thibault@gmail.com> wrote:

> Hello all,
>
> First off, I apologize for sending this to both the user and dev lists, but
> I'm not sure which list should get it.  This is my first email to either
> list.
>
> I am working with UIMA and Solrcas and I'm getting this error:
> org.apache.uima.analysis_engine.AnalysisEngineProcessException
>     at org.apache.uima.solrcas.SolrCASConsumer.process(SolrCASConsumer.java:138)
>     at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
>     at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:377)
>     at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:295)
>     at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
>     at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:897)
>     at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:577)
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
>     at java.lang.String.substring(String.java:1931)
>     at org.apache.uima.jcas.tcas.Annotation.getCoveredText(Annotation.java:119)
>     at org.apache.uima.solrcas.SolrCASConsumer.process(SolrCASConsumer.java:126)
>     ... 6 more
>
> I edited SolrCASConsumer with the following lines right before line 126:
>        Annotation fsTemp = (Annotation) fs;
>        System.out.println("Processing Annotation: " + fsTemp.toString());
>
> Now, right before it calls fs.getCoveredText(), it prints this:
> Processing Annotation: Phrase
>    sofa: _InitialView
>    begin: -1
>    end: 60
>    candidates: FSArray
>    mappings: FSArray
>
> That explains the out-of-bounds error: the annotation's begin offset is -1.
> However, I'm not sure why my analysis engine produces those values.  I'm
> using MetaMapApiAE from the NIH's MetaMap project.
>
> This is the first phrase it processes in this document, and the first
> time it prints that debug text.  If I use the same AE in
> DocumentAnalyzer, it correctly shows the first Document as starting at
> position 0 and ending at position 191, with the first phrase spanning
> positions 0 to 7.
>
> I'm trying to run this in the CPE GUI with the following CPEDescriptor.xml:
> <?xml version="1.0" encoding="UTF-8"?>
> <cpeDescription xmlns="http://uima.apache.org/resourceSpecifier">
>     <collectionReader>
>         <collectionIterator>
>             <descriptor>
>                 <import location="../../../../../../../usr/local/apache-uima/examples/descriptors/collection_reader/FileSystemCollectionReader.xml"/>
>             </descriptor>
>             <configurationParameterSettings>
>                 <nameValuePair>
>                     <name>InputDirectory</name>
>                     <value>
>                         <string>/Users/davidt/Documents/workspace/BioSearch/resources/test_input</string>
>                     </value>
>                 </nameValuePair>
>             </configurationParameterSettings>
>         </collectionIterator>
>     </collectionReader>
>     <casProcessors casPoolSize="3" processingUnitThreadCount="1">
>         <casProcessor deployment="integrated" name="MetaMapApiAE">
>             <descriptor>
>                 <import location="../../../MetaMap UIMA Annotator/descriptors/MetaMapApiAE.xml"/>
>             </descriptor>
>             <deploymentParameters/>
>             <errorHandling>
>                 <errorRateThreshold action="terminate" value="0/1000"/>
>                 <maxConsecutiveRestarts action="terminate" value="30"/>
>                 <timeout max="100000" default="-1"/>
>             </errorHandling>
>             <checkpoint batch="10000" time="1000ms"/>
>             <configurationParameterSettings>
>                 <nameValuePair>
>                     <name>tempdir_path</name>
>                     <value>
>                         <string>/Users/davidt/tmp</string>
>                     </value>
>                 </nameValuePair>
>             </configurationParameterSettings>
>         </casProcessor>
>         <casProcessor deployment="integrated" name="SolrcasAE.xml">
>             <descriptor>
>                 <import location="../../../Apache_UIMA_Sandbox/Solrcas/desc/SolrcasAE.xml"/>
>             </descriptor>
>             <deploymentParameters/>
>             <errorHandling>
>                 <errorRateThreshold action="terminate" value="0/1000"/>
>                 <maxConsecutiveRestarts action="terminate" value="30"/>
>                 <timeout max="100000" default="-1"/>
>             </errorHandling>
>             <checkpoint batch="10000" time="1000ms"/>
>         </casProcessor>
>     </casProcessors>
>     <cpeConfig>
>         <numToProcess>-1</numToProcess>
>         <deployAs>immediate</deployAs>
>         <checkpoint batch="0" time="300000ms"/>
>         <timerImpl/>
>     </cpeConfig>
> </cpeDescription>
>
> I'm at a loss as to where that -1 is coming from or how to debug it
> further.  Any ideas would be greatly appreciated.
>
> Best,
> Dave
>
>

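For anyone hitting the same trace: the "begin: -1" offset shown in the quoted message can be caught before getCoveredText() is ever called, which makes it easier to tell which upstream annotator produced the bad span. Below is a minimal sketch of such a guard in plain Java. The SpanCheck class and both of its methods are hypothetical names made up for illustration; they are not part of UIMA, Solrcas, or MetaMap. In a real consumer the three arguments would presumably come from the annotation's getBegin() and getEnd() and from the length of the CAS document text.

```java
/** Hypothetical helper for illustration only; not part of UIMA or Solrcas. */
final class SpanCheck {

    private SpanCheck() {
        // static helpers only
    }

    /** True when [begin, end) is a legal substring range for a text of the given length. */
    static boolean isValidSpan(int begin, int end, int textLength) {
        return begin >= 0 && begin <= end && end <= textLength;
    }

    /** Returns the covered text, or null when the text is missing or the offsets are out of range. */
    static String coveredTextOrNull(String text, int begin, int end) {
        if (text == null || !isValidSpan(begin, end, text.length())) {
            return null;
        }
        return text.substring(begin, end);
    }

    public static void main(String[] args) {
        // Build a 191-character stand-in for the document text from the report.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 191; i++) {
            sb.append('x');
        }
        String text = sb.toString();

        // Offsets taken from the debug output in the report (begin -1, end 60).
        System.out.println(isValidSpan(-1, 60, text.length()));
        // Offsets DocumentAnalyzer showed for the first phrase (0 to 7).
        System.out.println(isValidSpan(0, 7, text.length()));
    }
}
```

Logging the annotation type and offsets whenever isValidSpan returns false, instead of letting substring throw, should point at the component that set begin to -1 without aborting the whole CPE run.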