ctakes-user mailing list archives

From Bruce Tietjen <bruce.tiet...@perfectsearchcorp.com>
Subject Re: Running cTAKES through Java
Date Mon, 29 Jun 2015 14:17:13 GMT
If you can run your process in a debugger such as Eclipse, you can
suspend execution during those 12 minutes and check the stack to see what is
happening.
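
If attaching a debugger is not convenient, the same stack information can
be captured from inside the JVM. The following is only a minimal sketch,
not part of cTAKES: the class name StackSnapshot and the method
dumpAllThreads are made up for illustration. It relies only on the standard
Thread.getAllStackTraces() call.

```java
import java.util.Map;

// Hypothetical helper (not part of cTAKES): collects the current stack of
// every live thread, similar to what a debugger shows after "suspend".
public class StackSnapshot {

    /** Returns a plain-text dump of each live thread's name, state, and frames. */
    public static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Print one snapshot; in a real run this would be called from a
        // background thread while the pipeline is in its slow phase.
        System.out.println(dumpAllThreads());
    }
}
```

Calling StackSnapshot.dumpAllThreads() from a background thread a few
minutes into the slow phase should show which frames the main thread is
sitting in. The JDK's jstack tool, run against the process id, gives a
comparable dump without any code changes.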

When I experienced similar behavior, the Dictionary Lookup was reading the
database files from a .JAR file in my .m2 (Maven) repository. The
easiest way I found to avoid this was to delete or rename that
file in my .m2 directory. This is very annoying, because rebuilding
re-downloads the file and I have to do it again. (If there is a better way,
I would love to hear about it.)
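
One way to check whether a resource is being resolved from a JAR rather
than from a directory is to ask the class loader where it finds it. This is
a rough sketch only: the class name ResourceLocator and the method describe
are hypothetical, and the default resource name is just the descriptor path
quoted later in this thread, used as an example. Pass the name of the
dictionary file you suspect instead.

```java
import java.net.URL;

// Hypothetical helper (not part of cTAKES): reports where a classpath
// resource is resolved from, so a "jar:" location can be spotted.
public class ResourceLocator {

    /** Describes the URL the system class loader resolves resourceName to. */
    public static String describe(String resourceName) {
        URL url = ClassLoader.getSystemResource(resourceName);
        if (url == null) {
            return "not found on classpath: " + resourceName;
        }
        String kind = "jar".equals(url.getProtocol())
                ? "inside a JAR"
                : "protocol " + url.getProtocol();
        return resourceName + " -> " + url + " (" + kind + ")";
    }

    public static void main(String[] args) {
        // Example name taken from the error message in this thread;
        // replace it with the resource you want to locate.
        String name = args.length > 0
                ? args[0]
                : "org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml";
        System.out.println(describe(name));
    }
}
```

Run it with the same classpath as the pipeline. A URL that starts with
"jar:file:" and points into the .m2 directory would suggest the resource is
being read from the Maven repository copy.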




IMAT Solutions <http://imatsolutions.com>
Bruce Tietjen
Senior Software Engineer
Mobile: 801.634.1547
bruce.tietjen@imatsolutions.com

On Sat, Jun 27, 2015 at 9:08 PM, Jeff Headley <jeffunf96@gmail.com> wrote:

> I was able to get by the error by modifying my installation's
> DictionaryLookupAnnotatorUMLS.xml file. I changed:
>
> <fileUrl>file:org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml</fileUrl>
>
> to
>
> <fileUrl>file:resources/org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml</fileUrl>
>
> and that seemed to work.
>
> I saw only a slight performance improvement, however. Would anyone be able
> to tell me what happens between these two log statements that takes
> about 12 minutes?
>
> 2015-06-27 22:45:02.374  INFO 8972 --- [           main]
> .a.c.d.l.a.UmlsDictionaryLookupAnnotator : process(JCas)
> 2015-06-27 22:57:39.385  INFO 8972 --- [           main]
> o.a.c.c.parser.MaxentParserWrapper       : Started processing: null
>
> On Sat, Jun 27, 2015 at 12:45 PM, Jeff Headley <jeffunf96@gmail.com>
> wrote:
>
>> I have changed my cTAKES dependencies in my pom back to
>> <scope>provided</scope> and I think I have the classpath set correctly as
>> it seems to start out ok but eventually gets this new error. I'm hoping
>> maybe someone has seen this before and can help me out. I believe my cTAKES
>> is installed correctly. I followed the guide and can use the CVD. The
>> analysis engine I'm attempting to load
>> is desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextUMLSProcessor.xml.
>>
>> 2015-06-27 12:36:07.425 DEBUG 10332 --- [           main]
>> o.a.ctakes.core.ae.OverlapAnnotator      : Overlap bitset: {3}
>> 2015-06-27 12:36:07.453  INFO 10332 --- [           main]
>> o.a.c.d.p.ae.ClearNLPDependencyParserAE  : using Morphy analysis? true
>> Loading configuration.
>> Loading feature templates.
>> Loading lexica.
>> Loading model:
>>
>> ........................................................................................
>> 2015-06-27 12:36:16.930  INFO 10332 --- [           main]
>> org.apache.ctakes.chunker.ae.Chunker     : Chunker model file:
>> org/apache/ctakes/chunker/models/chunker-model.zip
>> 2015-06-27 12:36:17.952  INFO 10332 --- [           main]
>> c.c.a.ContextDependentTokenizerAnnotator : Finite state machines loaded.
>> 2015-06-27 12:36:17.959  INFO 10332 --- [           main]
>> o.a.c.c.parser.ae.ConstituencyParser     : Initializing parser...
>> 2015-06-27 12:36:20.616  INFO 10332 --- [           main]
>> o.a.ctakes.necontexts.ContextAnnotator   : SCOPE ORDER: [1, 3]
>> 2015-06-27 12:36:20.619  INFO 10332 --- [           main]
>> o.a.c.n.n.NegationContextAnalyzer        : initBoundaryData() called for
>> ContextInitializer
>> 2015-06-27 12:36:20.758  INFO 10332 --- [           main]
>> org.apache.ctakes.postagger.POSTagger    : POS tagger model file:
>> org/apache/ctakes/postagger/models/mayo-pos.zip
>> 2015-06-27 12:36:21.061 ERROR 10332 --- [           main]
>> c.e.c.processors.CommandLineProcessor    : ResourceInitializationException:
>>
>> org.apache.uima.resource.ResourceInitializationException: Error
>> initializing "org.apache.uima.resource.impl.DataResource_impl" from
>> descriptor
>> file:/D:/java/apache-ctakes-3.2.2/desc/ctakes-dictionary-lookup/desc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml.
>> at
>> org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:144)
>> at
>> org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
>> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:243)
>> at
>> org.apache.uima.resource.impl.ResourceManager_impl.registerResource(ResourceManager_impl.java:565)
>> at
>> org.apache.uima.resource.impl.ResourceManager_impl.initializeExternalResources(ResourceManager_impl.java:442)
>> at
>> org.apache.uima.resource.Resource_ImplBase.initialize(Resource_ImplBase.java:153)
>> at
>> org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.initialize(AnalysisEngineImplBase.java:157)
>> at
>> org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:123)
>> at
>> org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>> at
>> org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
>> at
>> org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387)
>> at
>> org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254)
>> at
>> .
>> .
>> .
>> Caused by: org.apache.uima.resource.ResourceInitializationException:
>> Could not access the resource data at
>> file:org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml.
>> at
>> org.apache.uima.resource.impl.DataResource_impl.initialize(DataResource_impl.java:127)
>> at
>> org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:123)
>> ... 35 common frames omitted
>>
>> On Fri, Jun 26, 2015 at 9:46 AM, Bruce Tietjen <
>> bruce.tietjen@perfectsearchcorp.com> wrote:
>>
>>> I'm sorry I don't have any current numbers for running that pipeline
>>> because we need more than just entity recognition. We also need polarity,
>>> certainty, etc.
>>>
>>> We have done a lot of optimization work in the more expensive parts of
>>> the pipeline and have made modifications to some areas to make them thread
>>> safe to enable running multiple pipelines concurrently within the same
>>> process. We have also made changes so most of the models that are loaded
>>> can be shared across multiple pipelines.
>>>
>>> We have not had time and resources to share these changes with the
>>> community yet, but intend to make our changes available to the community as
>>> soon as we feel they are ready.
>>>
>>>
>>> IMAT Solutions <http://imatsolutions.com>
>>> Bruce Tietjen
>>> Senior Software Engineer
>>> Mobile: 801.634.1547
>>> bruce.tietjen@imatsolutions.com
>>>
>>> On Thu, Jun 25, 2015 at 11:43 PM, Sai Anuroop <sai.anuroop@abzooba.com>
>>> wrote:
>>>
>>>> Hi All,
>>>> I am currently working with the developer version of cTAKES on Windows
>>>> through Eclipse.
>>>> @Jeff: Thanks for your reply.
>>>> @Lance: I am new to cTAKES and Java, so could you please give me the code
>>>> that runs the cTAKES CPE in the background without opening the CUI and
>>>> produces XML output? If the code given does that, can you please tell me
>>>> where to create the above Java class (in which project)?
>>>> @Bruce: Thanks for your posts. Can you tell me the average and best
>>>> time for cTAKES to analyze, say, a 20-line discharge report
>>>> using AggregatePlaintextFastUMLSProcessor?
>>>>
>>>> Regards,
>>>>
>>>> Vetsa Sai Anuroop
>>>>
>>>>
>>>>
>>>
>>
>
