ctakes-user mailing list archives

From Jeff Headley <jeffun...@gmail.com>
Subject Re: Running cTAKES through Java
Date Tue, 30 Jun 2015 14:48:36 GMT
Thanks Bruce for the help. I think I’m experiencing something similar. Even though I’m
using <scope>provided</scope>, the cTAKES jars are ending up in my jar. I think
I need to start over and maybe not use Maven and/or Spring Boot. Nothing I’m trying is keeping
the cTAKES jars out of my jar’s lib folder.
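
One way to see which copy of a cTAKES resource is actually being picked up is to ask the class loader directly. The sketch below is only an illustration, not something from cTAKES itself: the class name ClasspathLocator is made up, and the default path is simply the LookupDesc_Db.xml path quoted further down in this thread. It uses only the standard JDK call ClassLoader.getResources. A URL that starts with "jar:" means the resource is being read from inside a JAR (for example one under .m2 or in the application's lib folder), while "file:" means a plain directory.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists every classpath location that provides a resource.
final class ClasspathLocator {

    /** Returns all URLs on the current classpath that supply the given resource path. */
    static List<URL> locate(String resourcePath) throws IOException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return Collections.list(loader.getResources(resourcePath));
    }

    public static void main(String[] args) throws IOException {
        // Default path taken from the fileUrl quoted in this thread; pass another as args[0].
        String path = args.length > 0
                ? args[0]
                : "org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml";
        List<URL> hits = locate(path);
        if (hits.isEmpty()) {
            System.out.println("Not found on classpath: " + path);
        }
        for (URL url : hits) {
            // "jar" protocol: read from inside a JAR; "file" protocol: read from a directory.
            System.out.println(url.getProtocol() + " -> " + url);
        }
    }
}
```

Running it with the same classpath as the application should show whether more than one copy of the resource is visible and where each copy lives.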

> On Jun 29, 2015, at 10:17 AM, Bruce Tietjen <bruce.tietjen@perfectsearchcorp.com> wrote:
> 
> If you can run your process in a debugger like Eclipse, then you can suspend execution during that 12 minutes and check the stack to see what is happening.
> 
> When I experienced similar behavior, the Dictionary Lookup was reading the database files from a .JAR file that was in my .m2 (Maven) repository. The easiest way I found to avoid this was to delete or rename the file in my .m2 directory. This is very annoying because rebuilding re-downloads the files and I have to do it again. (If there is a better way, I would love to hear about it.)
> 
> 
> 
> 
> Bruce Tietjen <http://imatsolutions.com/>
> Senior Software Engineer
> 801.634.1547
> bruce.tietjen@imatsolutions.com
> 
> On Sat, Jun 27, 2015 at 9:08 PM, Jeff Headley <jeffunf96@gmail.com> wrote:
> I was able to get by the error by modifying my installation's DictionaryLookupAnnotatorUMLS.xml file. I changed:
> <fileUrl>file:org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml</fileUrl>
> 
> to
> <fileUrl>file:resources/org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml</fileUrl>
> 
> and that seemed to work.
> 
> I saw only a slight performance improvement, however. Would anyone be able to tell me what is going on between these two log statements that takes about 12 minutes?
> 
> 2015-06-27 22:45:02.374  INFO 8972 --- [           main] .a.c.d.l.a.UmlsDictionaryLookupAnnotator : process(JCas)
> 2015-06-27 22:57:39.385  INFO 8972 --- [           main] o.a.c.c.parser.MaxentParserWrapper       : Started processing: null
> 
> On Sat, Jun 27, 2015 at 12:45 PM, Jeff Headley <jeffunf96@gmail.com> wrote:
> I have changed the cTAKES dependencies in my pom back to <scope>provided</scope>, and I think I have the classpath set correctly, as it seems to start out OK but eventually hits this new error. I'm hoping someone has seen this before and can help me out. I believe my cTAKES is installed correctly: I followed the guide and can use the CVD. The analysis engine I'm attempting to load is desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextUMLSProcessor.xml.
> 
> 2015-06-27 12:36:07.425 DEBUG 10332 --- [           main] o.a.ctakes.core.ae.OverlapAnnotator      : Overlap bitset: {3}
> 2015-06-27 12:36:07.453  INFO 10332 --- [           main] o.a.c.d.p.ae.ClearNLPDependencyParserAE  : using Morphy analysis? true
> Loading configuration.
> Loading feature templates.
> Loading lexica.
> Loading model:
> ........................................................................................
> 2015-06-27 12:36:16.930  INFO 10332 --- [           main] org.apache.ctakes.chunker.ae.Chunker     : Chunker model file: org/apache/ctakes/chunker/models/chunker-model.zip
> 2015-06-27 12:36:17.952  INFO 10332 --- [           main] c.c.a.ContextDependentTokenizerAnnotator : Finite state machines loaded.
> 2015-06-27 12:36:17.959  INFO 10332 --- [           main] o.a.c.c.parser.ae.ConstituencyParser     : Initializing parser...
> 2015-06-27 12:36:20.616  INFO 10332 --- [           main] o.a.ctakes.necontexts.ContextAnnotator   : SCOPE ORDER: [1, 3]
> 2015-06-27 12:36:20.619  INFO 10332 --- [           main] o.a.c.n.n.NegationContextAnalyzer        : initBoundaryData() called for ContextInitializer
> 2015-06-27 12:36:20.758  INFO 10332 --- [           main] org.apache.ctakes.postagger.POSTagger    : POS tagger model file: org/apache/ctakes/postagger/models/mayo-pos.zip
> 2015-06-27 12:36:21.061 ERROR 10332 --- [           main] c.e.c.processors.CommandLineProcessor    : ResourceInitializationException:
> 
> org.apache.uima.resource.ResourceInitializationException: Error initializing "org.apache.uima.resource.impl.DataResource_impl" from descriptor file:/D:/java/apache-ctakes-3.2.2/desc/ctakes-dictionary-lookup/desc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml.
> 	at org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:144)
> 	at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
> 	at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
> 	at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:243)
> 	at org.apache.uima.resource.impl.ResourceManager_impl.registerResource(ResourceManager_impl.java:565)
> 	at org.apache.uima.resource.impl.ResourceManager_impl.initializeExternalResources(ResourceManager_impl.java:442)
> 	at org.apache.uima.resource.Resource_ImplBase.initialize(Resource_ImplBase.java:153)
> 	at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.initialize(AnalysisEngineImplBase.java:157)
> 	at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:123)
> 	at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
> 	at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
> 	at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
> 	at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387)
> 	at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254)
> 	at 
> .
> .
> .
> Caused by: org.apache.uima.resource.ResourceInitializationException: Could not access the resource data at file:org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml.
> 	at org.apache.uima.resource.impl.DataResource_impl.initialize(DataResource_impl.java:127)
> 	at org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:123)
> 	... 35 common frames omitted
> 
> On Fri, Jun 26, 2015 at 9:46 AM, Bruce Tietjen <bruce.tietjen@perfectsearchcorp.com> wrote:
> I'm sorry I don't have any current numbers for running that pipeline because we need more than just entity recognition. We also need polarity, certainty, etc.
> 
> We have done a lot of optimization work in the more expensive parts of the pipeline and have made modifications to some areas to make them thread safe to enable running multiple pipelines concurrently within the same process. We have also made changes so most of the models that are loaded can be shared across multiple pipelines.
> 
> We have not had time and resources to share these changes with the community yet, but intend to make our changes available to the community as soon as we feel they are ready.
> 
> 
> Bruce Tietjen <http://imatsolutions.com/>
> Senior Software Engineer
> 801.634.1547
> bruce.tietjen@imatsolutions.com
> 
> On Thu, Jun 25, 2015 at 11:43 PM, Sai Anuroop <sai.anuroop@abzooba.com> wrote:
> Hi All,
> I am presently working with the developer version of cTAKES on Windows through Eclipse.
> @Jeff: Thanks for your reply.
> @Lance: I am new to cTAKES and Java, so could you please give me the code that runs the cTAKES CPE in the background without opening the CUI and produces XML output? If the code given already does this, could you please tell me where to create the above Java class (in which project)?
> @Bruce: Thanks for your posts. Can you tell me the average and best times for cTAKES to analyze, say, a 20-line discharge report using AggregatePlaintextFastUMLSProcessor?
> 
> Regards,
> Vetsa Sai Anuroop
> 
> 
> 
> 
> 