ctakes-user mailing list archives

From Lance Eason <la...@iodinesoftware.com>
Subject Re: Running cTAKES through Java
Date Thu, 18 Jun 2015 16:24:37 GMT
Make sure an unpacked copy of the resources (not inside a jar) comes first
on the classpath.  See the instructions for installing the resources here:
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1+User+Install+Guide

My classpath looks like:

$APP_HOME/desc:$APP_HOME/resources:[all the other jars]

'desc' is where the pipeline definitions live, 'resources' is where a bunch
of miscellaneous resources (dictionaries, various ML models, etc.) live.
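
For anyone wondering why the unpacked copy matters: java.io.File can only be built from a plain file: URI, and a resource found inside a jar resolves to a jar:file:...!/... URI, which is where "URI is not hierarchical" comes from. A rough sketch for checking how a resource resolves on your classpath follows. The class name and the default resource path are my own placeholders, not anything taken from cTAKES, so pass in whatever resource your pipeline actually loads.

```java
import java.io.File;
import java.net.URI;
import java.net.URL;

final class ResourceLocationCheck {

    /** Returns true if the URI can be turned into a java.io.File (a plain file: URI). */
    static boolean isPlainFile(URI uri) {
        try {
            new File(uri);
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown for opaque URIs such as jar:file:/...!/..., with the
            // message "URI is not hierarchical"
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical default; replace with the resource your pipeline loads.
        String name = args.length > 0
                ? args[0]
                : "org/apache/ctakes/lvg/data/config/lvg.properties";

        URL url = ResourceLocationCheck.class.getClassLoader().getResource(name);
        if (url == null) {
            System.out.println(name + " is not on the classpath");
        } else if (isPlainFile(url.toURI())) {
            System.out.println(url + " -> unpacked on disk");
        } else {
            System.out.println(url + " -> inside an archive");
        }
    }
}
```

Run it with the same classpath as your application. If the resource reports as being inside an archive, the unpacked 'resources' directory is either missing or listed after the jars.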

Also, I notice the specific error you're getting is thrown while loading
LVG.  I'd strongly recommend removing LVG from your pipeline, especially if
you're doing multi-threaded runs.  It's the only component in the standard
pipeline that isn't thread-safe, and it's a huge performance sink to boot
for not much added value.

You can remove it by editing the pipeline XML and removing:

    <delegateAnalysisEngine key="LvgAnnotator">
      <import location="../../../ctakes-lvg/desc/analysis_engine/LvgAnnotator.xml"/>
    </delegateAnalysisEngine>

and:

    <node>LvgAnnotator</node>
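
If you'd rather not hand-edit the descriptor, the same two removals can be scripted with the JDK's DOM classes. This is only a sketch: the class and method names are made up for illustration, and it assumes the delegate key and flow node are spelled exactly "LvgAnnotator" as in the snippets above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class LvgRemover {

    /** Returns the descriptor XML with the LvgAnnotator delegate and flow node removed. */
    static String removeLvg(String descriptorXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(descriptorXml)));

            // Collect first, remove afterwards: NodeList is live and would
            // shift under us if we removed while iterating.
            List<Node> toRemove = new ArrayList<>();

            NodeList delegates = doc.getElementsByTagName("delegateAnalysisEngine");
            for (int i = 0; i < delegates.getLength(); i++) {
                Element delegate = (Element) delegates.item(i);
                if ("LvgAnnotator".equals(delegate.getAttribute("key"))) {
                    toRemove.add(delegate);
                }
            }

            NodeList flowNodes = doc.getElementsByTagName("node");
            for (int i = 0; i < flowNodes.getLength(); i++) {
                Node flowNode = flowNodes.item(i);
                if ("LvgAnnotator".equals(flowNode.getTextContent().trim())) {
                    toRemove.add(flowNode);
                }
            }

            for (Node node : toRemove) {
                node.getParentNode().removeChild(node);
            }

            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not rewrite descriptor", e);
        }
    }
}
```

Read the pipeline XML into a string, pass it through removeLvg, and write the result to a copy rather than over the original. The serializer may not preserve the original whitespace exactly, so diff the output before using it.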

On Wed, Jun 17, 2015 at 8:58 PM, Jeff Headley <jeffunf96@gmail.com> wrote:

> Thank you for posting this code. I too am trying to run cTAKES from within
> a Java application. It works fine until the line:
> AnalysisEngine analysisEngine = UIMAFramework.produceAnalysisEngine(pipelineSpecifier, threadCount, 0);
>
> From there it throws the error below. My cTAKES installation is 3.2.2
> and I have set up UMLS credentials, etc. Any ideas what is wrong?
>
> java.lang.IllegalArgumentException: URI is not hierarchical
> at java.io.File.<init>(File.java:418)
> at org.apache.ctakes.lvg.resource.LvgCmdApiResourceImpl.load(LvgCmdApiResourceImpl.java:65)
> at org.apache.uima.resource.impl.ResourceManager_impl.registerResource(ResourceManager_impl.java:603)
> at org.apache.uima.resource.impl.ResourceManager_impl.initializeExternalResources(ResourceManager_impl.java:442)
> at org.apache.uima.resource.Resource_ImplBase.initialize(Resource_ImplBase.java:153)
> at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.initialize(AnalysisEngineImplBase.java:157)
> at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:123)
> at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
> at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
> at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387)
> at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254)
> at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:431)
> at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:375)
> at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:185)
> at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
> at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
> at org.apache.uima.internal.util.ResourcePool.fillPool(ResourcePool.java:243)
> at org.apache.uima.internal.util.ResourcePool.<init>(ResourcePool.java:100)
> at org.apache.uima.internal.util.AnalysisEnginePool.<init>(AnalysisEnginePool.java:91)
> at org.apache.uima.analysis_engine.impl.MultiprocessingAnalysisEngine_impl.initialize(MultiprocessingAnalysisEngine_impl.java:118)
> at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
> at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
> at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:475)
>
> Thank you,
> Jeff
>
> On Tue, Jun 16, 2015 at 10:36 AM, Lance Eason <lance@iodinesoftware.com>
> wrote:
>
>> Sai, here's an example from what I'm using.  I'm using multiple threads
>> to process documents concurrently; if you're not interested in that, you
>> can ignore the CasPool stuff and just instantiate a CAS directly.  You *do*
>> want to re-use CAS instances though; they're very expensive to create.
>>
>> // the name of the analysis engine xml file
>> String pipelineFileName =
>>     "./desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextUMLSProcessor.xml";
>>
>> // the number of simultaneous pipelines to support
>> int threadCount = 3;
>>
>> // load the pipeline specifier
>> XMLInputSource input = new XMLInputSource(new File(pipelineFileName));
>> ResourceSpecifier pipelineSpecifier =
>> UIMAFramework.getXMLParser().parseResourceSpecifier(input);
>>
>> // create the analysis engine for the pipeline and allocate some CAS
>> AnalysisEngine analysisEngine =
>> UIMAFramework.produceAnalysisEngine(pipelineSpecifier, threadCount, 0);
>> CasPool casPool = new CasPool(threadCount, analysisEngine);
>>
>>
>>
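>> // for each document...
>> // getCas(0) waits until a CAS is free; the no-arg getCas() returns null
>> // when the pool is empty
>> CAS cas = casPool.getCas(0);
>> try
>> {
>>     // process the document
>>     cas.reset();
>>     cas.setDocumentLanguage("en");
>>     cas.setDocumentText(textToAnalyze);
>>     analysisEngine.process(cas);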
>>
>>     // then consume the assertions of whatever type you're interested in
>>     Type eventType = cas.getTypeSystem()
>>         .getType("org.apache.ctakes.typesystem.type.textsem.EventMention");
>>
>>     FSIterator<FeatureStructure> iter =
>>         cas.getIndexRepository().getAllIndexedFS(eventType);
>>     while (iter.hasNext())
>>     {
>>         FeatureStructure fs = iter.next();
>>
>>         // extract information from the assertion
>>     }
>> }
>> finally
>> {
>>     casPool.releaseCas(cas);
>> }
>>
>> On Tue, Jun 16, 2015 at 2:37 AM, Sai Anuroop <sai.anuroop@abzooba.com>
>> wrote:
>>
>>> Hi All,
>>>
>>> I want to run a cTAKES CPE by choosing a Collection Reader, AE and CAS
>>> Consumer from Java directly so that I can reduce the time taken to
>>> process text documents. Could anyone explain how to do this with some
>>> example Java code, or point me to any resources?
>>>
>>> Regards,
>>>
>>> Vetsa Sai Anuroop
>>>
>>>
>>>
>>
>>
>> --
>> .........................................................
>> *Lance Eason*
>> Iodine Software
>> Vice President of Engineering
>> lance@iodinesoftware.com
>> 512.785.5195 office | 801.203.8987 fax
>> .........................................................
>>
>>
>


