ctakes-user mailing list archives

From "Vogel, James" <JVo...@activehealth.net>
Subject RE: Package for use with Solr
Date Fri, 06 Sep 2013 13:24:32 GMT
What is the recommended way to deploy cTAKES for processing of a large number of documents
on a regular basis while dealing with the single thread restriction?

Can anyone point me to any examples of using SolrCas with cTAKES?

-----Original Message-----
From: Pei Chen [mailto:chenpei@apache.org]
Sent: Thursday, September 05, 2013 2:40 PM
To: user@ctakes.apache.org
Subject: Re: Package for use with Solr

James,
Also, if you plan to run this in the same JVM process as Solr or another
multithreaded webapp setup,
be sure to look out for Jira CTAKES-151 [1]:
https://issues.apache.org/jira/browse/CTAKES-151
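
If the pipeline must not be invoked from several threads at once, one generic way to live with that inside a multithreaded container is to funnel every call through a single worker thread. The sketch below is only an illustration of that idea, not cTAKES or Solr API: `SerializedProcessor` is a hypothetical name, and the `Function<String, String>` delegate is a stand-in for whatever actually runs the analysis engine.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Funnels all calls to a non-thread-safe processor through one worker thread. */
final class SerializedProcessor implements AutoCloseable {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    // Hypothetical stand-in for the real pipeline invocation.
    private final Function<String, String> delegate;

    SerializedProcessor(Function<String, String> delegate) {
        this.delegate = delegate;
    }

    /** Safe to call from any thread; the delegate only ever runs on the worker thread. */
    String process(String text) throws InterruptedException, ExecutionException {
        Callable<String> task = () -> delegate.apply(text);
        return worker.submit(task).get();
    }

    @Override
    public void close() {
        worker.shutdown();
    }

    public static void main(String[] args) throws Exception {
        try (SerializedProcessor p = new SerializedProcessor(String::toUpperCase)) {
            System.out.println(p.process("aspirin"));
        }
    }
}
```

Callers block until their document has been processed, so throughput is that of one thread; running several JVM processes, each with its own pipeline, is the other obvious option for large batches.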

On Thu, Sep 5, 2013 at 10:38 AM, Pei Chen <chenpei@apache.org> wrote:
> James,
> Ensure that directory is on your classpath. To test the theory, try
> making that path an {absolute_path} instead of res.
>
> Note, if you're trying to run this under a servlet container, the
> webapp could have a different class loader than the container.
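
A quick way to test the class loader theory is to ask each loader directly whether it can see a given resource. The following is a minimal sketch using only the JDK; `ResourceProbe` is a hypothetical helper name, and the resource name in `main` is the TypeSystem.xml path mentioned elsewhere in this thread, written with forward slashes as classpath resource names require.

```java
import java.net.URL;

final class ResourceProbe {
    /** Returns the URL the loader resolves the resource to, or null if it is not visible. */
    static URL locate(ClassLoader loader, String resourceName) {
        // Classpath resource names use '/' separators and no leading slash.
        return loader == null
                ? ClassLoader.getSystemResource(resourceName)
                : loader.getResource(resourceName);
    }

    /** Prints what the thread context loader and this class's own loader can see. */
    static void report(String resourceName) {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        ClassLoader own = ResourceProbe.class.getClassLoader();
        System.out.println("context loader: " + locate(context, resourceName));
        System.out.println("class loader:   " + locate(own, resourceName));
    }

    public static void main(String[] args) {
        report("org/apache/ctakes/typesystem/types/TypeSystem.xml");
    }
}
```

If one loader prints a URL and the other prints null, the resource is on a classpath that the failing code cannot see. Running `report` from inside the webapp (for example from a test servlet) shows what that webapp's loaders resolve.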
>
> On Thu, Sep 5, 2013 at 9:37 AM, Vogel, James <JVogel@activehealth.net> wrote:
>> hsqldb is throwing java.sql.SQLException: File input/output error:
>> java.io.IOException: Stream closed at the following code in
>> JdbcConnectionResourceImpl.load:
>>
>>
>>
>> iv_conn = DriverManager.getConnection(
>>               urlStr,
>>               username,
>>               password);
>>
>>
>>
>> urlStr is:
>> jdbc:hsqldb:res:org/apache/ctakes/dictionary/lookup/umls2011ab/umls
>>
>> username is: SA
>>
>> password is: blank
>>
>>
>>
>> That path is expanded as regular files along with the other resources off of
>> my default directory and on the classpath.  Is there something special about
>> the placement of the hsqldb files or something I need to do to enable use of
>> hsqldb?
>>
>>
>>
>> Exception details:
>>
>> Caused by: org.apache.uima.resource.ResourceInitializationException
>>     at org.apache.ctakes.core.resource.JdbcConnectionResourceImpl.load(JdbcConnectionResourceImpl.java:130)
>>     at org.apache.uima.resource.impl.ResourceManager_impl.registerResource(ResourceManager_impl.java:611)
>>     at org.apache.uima.resource.impl.ResourceManager_impl.initializeExternalResources(ResourceManager_impl.java:450)
>>     at org.apache.uima.resource.Resource_ImplBase.initialize(Resource_ImplBase.java:182)
>>     at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.initialize(AnalysisEngineImplBase.java:157)
>>     at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:123)
>>     at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>>     at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>>     at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
>>     at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387)
>>     at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:255)
>>     at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:429)
>>     at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:373)
>>     at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:186)
>>     at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>>     at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>>     at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
>>     at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:354)
>>     at org.apache.lucene.analysis.uima.ae.BasicAEProvider.getAE(BasicAEProvider.java:73)
>>     at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processText(UIMAUpdateRequestProcessor.java:155)
>>     at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:80)
>>     ... 40 more
>> Caused by: java.sql.SQLException: File input/output error: java.io.IOException: Stream closed
>>     at org.hsqldb.jdbc.Util.sqlException(Unknown Source)
>>     at org.hsqldb.jdbc.jdbcConnection.<init>(Unknown Source)
>>     at org.hsqldb.jdbcDriver.getConnection(Unknown Source)
>>     at org.hsqldb.jdbcDriver.connect(Unknown Source)
>>     at java.sql.DriverManager.getConnection(Unknown Source)
>>     at java.sql.DriverManager.getConnection(Unknown Source)
>>     at org.apache.ctakes.core.resource.JdbcConnectionResourceImpl.load(JdbcConnectionResourceImpl.java:109)
>>     ... 60 more
>>
>>
>>
>> From: Chen, Pei [mailto:Pei.Chen@childrens.harvard.edu]
>>
>> Sent: Thursday, August 29, 2013 5:57 PM
>> To: <user@ctakes.apache.org>
>> Cc: user@ctakes.apache.org
>>
>> Subject: Re: Package for use with Solr
>>
>>
>>
>> Sorry if I was unclear. I meant that only the resources need to be unpacked.
>> What was the full stack trace?
>>
>>
>> Sent from my iPhone
>>
>>
>> On Aug 29, 2013, at 5:52 PM, "Vogel, James" <JVogel@activehealth.net> wrote:
>>
>> I do, including ctakes-type-system-3.0.0-incubating.jar.  I thought you said
>> that the xml files needed to be unpacked rather than just in jars so I was
>> assuming the reason it wasn't being found had to do with that.
>>
>>
>>
>> From: Chen, Pei [mailto:Pei.Chen@childrens.harvard.edu]
>> Sent: Thursday, August 29, 2013 5:43 PM
>> To: user@ctakes.apache.org
>> Subject: RE: Package for use with Solr
>>
>>
>>
>> James,
>>
>> Do you have all of the ctakes jars in your classpath?
>>
>> i.e:
>>
>> ctakes-type-system-{version}.jar in your classpath?
>>
>>
>>
>> --Pei
>>
>>
>>
>>
>>
>> From: Vogel, James [mailto:JVogel@activehealth.net]
>> Sent: Thursday, August 29, 2013 5:37 PM
>> To: user@ctakes.apache.org
>> Subject: RE: Package for use with Solr
>>
>>
>>
>> Is there something special about where
>> org\apache\ctakes\typesystem\types\TypeSystem.xml needs to be placed
>> relative to the other files in the binary distribution? I put a 'resources'
>> folder on the path containing
>> org\apache\ctakes\typesystem\types\TypeSystem.xml but it isn't being found.
>>
>>
>>
>> From: Pei Chen [mailto:chenpei@apache.org]
>> Sent: Wednesday, August 28, 2013 1:43 PM
>> To: user@ctakes.apache.org
>> Subject: Re: Package for use with Solr
>>
>>
>>
>> There is a current limitation where the resources need to be unpacked.
>> (Lucene doesn't like the indexes being inside the compressed jars.) Try
>> unpacking the resources and adding the resources directory to your classpath.
>>
>> --Pei
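
The "URI is not hierarchical" message in the trace below comes from the `java.io.File(URI)` constructor, which rejects opaque URIs such as `jar:file:...!/entry`. A resource that lives inside a jar resolves to exactly that kind of URI, which is why unpacking helps. The sketch below reproduces the failure with the JDK alone; `JarUriDemo` and the two locations in `main` are hypothetical and only serve as illustration.

```java
import java.io.File;
import java.net.URI;

final class JarUriDemo {
    /** Returns the File for a file: URI, or null when the URI points inside an archive. */
    static File toFileOrNull(URI uri) {
        if (!"file".equalsIgnoreCase(uri.getScheme())) {
            return null; // e.g. jar:file:/...!/entry has no java.io.File equivalent
        }
        return new File(uri);
    }

    /** Returns the exception message produced by new File(uri), or null if it succeeds. */
    static String fileConstructorError(URI uri) {
        try {
            new File(uri);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Hypothetical locations, for illustration only.
        URI inJar = URI.create("jar:file:/opt/app/resources.jar!/org/example/model.bin");
        URI onDisk = URI.create("file:/opt/app/resources/org/example/model.bin");
        System.out.println(fileConstructorError(inJar));
        System.out.println(toFileOrNull(inJar));
        System.out.println(toFileOrNull(onDisk));
    }
}
```

Code that only needs to read a resource can use `getResourceAsStream` and work from inside a jar; code that insists on a `File`, as the trace suggests `FileResourceImpl.load` does, needs the resource unpacked on disk.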
>>
>>
>>
>> On Wed, Aug 28, 2013 at 12:59 PM, Vogel, James <JVogel@activehealth.net>
>> wrote:
>>
>> I renamed apache-ctakes-3.1.0-SNAPSHOT-bin.zip to jar and created a jar
>> containing the contents of ctakes-dictionary-lookup\resources and put them
>> on the classpath.  When I don't add the resources jar I get an error because
>> it can't find those files.  Once I get past that, I get a
>> java.lang.IllegalArgumentException: URI is not hierarchical exception, full
>> stack trace below.  I saw some posts from 2012
>> (http://ctakes.markmail.org/search/?q=URI+is+not+hierarchical#query:URI%20is%20not%20hierarchical+page:1+mid:yaqtqzbylwdyy35n+state:results)
>> where you wrote about a similar problem.  Any suggestions on why this
>> happens?
>>
>>
>>
>> 2013-08-28 16:41:51,692 ERROR core.SolrCore - org.apache.solr.common.SolrException: processing error URI is not hierarchical. id=file:/C:/Users/jvogel/Documents/data/uima_tests/drug%20name%20test%20v1.txt, text="George Washington was aspirin exposed to butenafine hydrochloride which is a more complex drug refer..."
>>     at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:118)
>>     at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>>     at com.lucid.update.FieldMappingProcessor.processAdd(FieldMappingUpdateProcessorFactory.java:98)
>>     at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
>>     at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
>>     at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
>>     at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
>>     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
>>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:448)
>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:269)
>>     at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
>>     at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
>>     at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
>>     at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
>>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
>>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
>>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
>>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
>>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
>>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
>>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
>>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
>>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
>>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
>>     at org.eclipse.jetty.server.Server.handle(Server.java:351)
>>     at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
>>     at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
>>     at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:900)
>>     at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:954)
>>     at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:952)
>>     at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
>>     at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
>>     at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
>>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
>>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
>>     at java.lang.Thread.run(Unknown Source)
>> Caused by: java.lang.IllegalArgumentException: URI is not hierarchical
>>     at java.io.File.<init>(Unknown Source)
>>     at org.apache.ctakes.core.resource.FileResourceImpl.load(FileResourceImpl.java:44)
>>     at org.apache.uima.resource.impl.ResourceManager_impl.registerResource(ResourceManager_impl.java:603)
>>     at org.apache.uima.resource.impl.ResourceManager_impl.initializeExternalResources(ResourceManager_impl.java:442)
>>     at org.apache.uima.resource.Resource_ImplBase.initialize(Resource_ImplBase.java:146)
>>     at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.initialize(AnalysisEngineImplBase.java:157)
>>     at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:122)
>>     at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>>     at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>>     at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:267)
>>     at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:361)
>>     at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254)
>>     at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:431)
>>     at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:375)
>>     at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:185)
>>     at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>>     at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>>     at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:267)
>>     at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:335)
>>     at org.apache.lucene.analysis.uima.ae.BasicAEProvider.getAE(BasicAEProvider.java:73)
>>     at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processText(UIMAUpdateRequestProcessor.java:155)
>>     at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:80)
>>     ... 40 more
>>
>>
>>
>> From: Pei Chen [mailto:chenpei@apache.org]
>> Sent: Wednesday, August 28, 2013 11:43 AM
>> To: user@ctakes.apache.org
>> Subject: Re: Package for use with Solr
>>
>>
>>
>> One can download the binaries, which have all of the jars and resources.  Or,
>> if you build from source, you can run $mvn package, which will generate a
>> convenience zip with all of the jars, transitive dependencies, and
>> resources neatly packaged inside ctakes-dist/target/.
>>
>>
>>
>> Hope that helps-
>>
>> Pei
>>
>>
>>
>> On Tue, Aug 27, 2013 at 6:07 PM, Vogel, James <JVogel@activehealth.net>
>> wrote:
>>
>> I'm not yet planning on SolrCAS because I'm not familiar with it yet.  First
>> I just want to use the UIMAUpdateRequestProcessor to map the results from
>> specific fields into solr.  I'm currently stuck on how to create jar(s) that
>> contain all the things under the *-res folders to work around the following
>> error:
>>
>> Caused by: java.io.FileNotFoundException:
>> org\apache\ctakes\dependency\parser\models\lemmatizer\dictionary-1.3.1.jar
>> (The system cannot find the path specified)
>>
>>
>>
>> I'm not familiar with maven.  The jars I created via the maven assembly
>> command only contain the classes.  Is there a maven command to create a
>> ctakes jar(s) that contains all the components (classes, xmls, dictionaries,
>> etc.)?
>>
>>
>>
>> From: Chen, Pei [mailto:Pei.Chen@childrens.harvard.edu]
>> Sent: Tuesday, August 27, 2013 9:58 AM
>> To: user@ctakes.apache.org
>> Subject: RE: Package for use with Solr
>>
>>
>>
>> Hi James,
>>
>> I believe the process of deploying a primitive annotator should be the same
>> as an aggregate pipeline that contains a fixed flow...
>>
>> I presume you would like to append something like the SolrCAS cas consumer
>> at the end of the pipeline?
>>
>> --Pei
>>
>>
>>
>> From: Vogel, James [mailto:JVogel@activehealth.net]
>> Sent: Tuesday, August 27, 2013 7:09 AM
>> To: user@ctakes.apache.org
>> Subject: Package for use with Solr
>>
>>
>>
>> Any guidance on how to package all the components needed for the ctakes
>> clinical pipeline AggregatePlaintextUMLSProcessor so that it can be deployed
>> as part of another application?  I'd like to deploy it to be run by the
>> UIMAUpdateRequestProcessor as part of solr document indexing. I know how to
>> package a single UIMA annotator by just referencing the .xml file and
>> putting the jar on the class path for Solr.  I don't know how to do the same
>> for all of the components in the pipeline.
>>
>>
>>
>> ________________________________
>>
>> IMPORTANT WARNING: Information contained in this email is intended for the
>> use of the individual to whom it is addressed, and may contain information
>> that is privileged, confidential, and exempt from disclosure under
>> applicable law. If you are not the intended recipient, or the employee or
>> agent responsible for delivering the message to the intended recipient, you
>> are hereby notified that any dissemination, distribution, or copying of this
>> communication is STRICTLY FORBIDDEN. If you have received this communication
>> in error, please notify us immediately by return email and delete this
>> document. Thank you.
>>
>>
>>
