ctakes-dev mailing list archives

From "Chen, Pei" <Pei.C...@childrens.harvard.edu>
Subject RE: docs on running Clinical Document Pipeline from Java?
Date Fri, 21 Jun 2013 17:55:12 GMT
Hi,
You can include the cTAKES code as a Maven dependency, but there is currently one limitation:
to run the pipeline, the /desc and /resources directories need to be unpacked somewhere
on disk.  There is an effort under way to streamline resource loading, which should make it
easier to integrate the modules.
Until then, you will need to run "mvn package -DskipTests", which packages everything
into a single distribution under ctakes-distribution/target.
Maybe there are other ways...
--Pei
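
For anyone who hits the "Import failed.  Could not read from URL" error quoted below, one
option is a small pre-flight check that lists which relative descriptor imports are missing
on disk before the pipeline is started. This is only a sketch under some assumptions: the
class DescriptorImportCheck and its two methods are hypothetical names (they are not part
of cTAKES or UIMA), it uses only the JDK, it assumes the descriptor is UTF-8 and that
<import location="..."/> values are relative file paths, and it skips <import name="..."/>
entries because those are looked up by name rather than by location.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical pre-flight helper; not part of cTAKES or UIMA. */
final class DescriptorImportCheck {

    /** Returns the value of every import element's "location" attribute, in document order. */
    static List<String> importLocations(String descriptorXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Descriptors declare a default XML namespace, so match on local names.
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(descriptorXml)));
            NodeList imports = doc.getElementsByTagNameNS("*", "import");
            List<String> locations = new ArrayList<>();
            for (int i = 0; i < imports.getLength(); i++) {
                Element element = (Element) imports.item(i);
                // Imports that use name="..." carry no file location; skip them.
                if (element.hasAttribute("location")) {
                    locations.add(element.getAttribute("location"));
                }
            }
            return locations;
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalArgumentException("Descriptor could not be parsed as XML", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Resolves each location against the descriptor's own directory and
     * returns the targets that do not exist. Assumes relative file paths
     * and a UTF-8 descriptor; it does not follow imports recursively.
     */
    static List<Path> missingImports(Path descriptor) {
        String xml;
        try {
            xml = new String(Files.readAllBytes(descriptor), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Path base = descriptor.toAbsolutePath().getParent();
        List<Path> missing = new ArrayList<>();
        for (String location : importLocations(xml)) {
            Path target = base.resolve(location).normalize();
            if (!Files.exists(target)) {
                missing.add(target);
            }
        }
        return missing;
    }
}
```

Calling DescriptorImportCheck.missingImports(...) on your copy of
AggregatePlaintextUMLSProcessor.xml should return an empty list once the /desc tree is
unpacked next to it in the layout the descriptor expects. Each entry it does return is a
file the aggregate refers to but cannot find. Descriptors that are themselves imported are
not inspected, so the check may need to be repeated on those.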

> -----Original Message-----
> From: Sandy Ryza [mailto:sandy.ryza@cloudera.com]
> Sent: Thursday, June 20, 2013 5:24 PM
> To: dev@ctakes.apache.org
> Subject: Re: docs on running Clinical Document Pipeline from Java?
> 
> Thanks for the help!  Is there any advice on the best way to include ctakes as
> a dependency?  I've tried writing some code that points to
> AggregatePlaintextUMLSProcessor.xml, but it doesn't know where to find
> the other files that are referred to.  Is there any good way to package ctakes
> up and refer to a unit?  We want to be able to distribute something that
> relies on ctakes in a cluster.
> 
> (Here's the error I'm getting)
> Import failed.  Could not read from URL
> file:/home/sandy/ctakes-dependency-parser/desc/analysis_engine/ClearParserDependencyParserAE.xml.
> (Descriptor: file:/home/sandy/datascience/Mayo_cTAKES/mr/AggregatePlaintextUMLSProcessor.xml)
> 
> -Sandy
> 
> 
> On Wed, Jun 19, 2013 at 2:30 PM, Andy McMurry <mcmurry.andy@gmail.com> wrote:
> 
> > Note: The WEKA gui reports the command line arguments for any GUI task.
> > It could be a very helpful timesaver if cTAKES had a similar feature.
> >
> > Otherwise, I fear we will be writing Main methods and docs for each
> > and every cTAKES task.
> > What do you all think?
> >
> > -------
> >
> > Real world example of how this works in Weka.
> > Say you wanted to run Adaboost on a C4.5 decision tree with cost
> > sensitive classification.
> > Weka reports the arguments, which I can re-run from command line
> >
> > Classifier csc = new CostSensitiveClassifier();
> >
> >         String[] adaboost = {
> >                 "-cost-matrix", costMatrix,
> >                 "-S", "1",
> >                 "-W", "weka.classifiers.meta.AdaBoostM1",
> >                 "--",
> >                 "-P", "100",
> >                 "-S", "1",
> >                 "-I", "30",
> >                 //
> >                 "-W", "weka.classifiers.trees.J48",
> >                 "--",
> >                 "-C", String.valueOf(j48Confidence),
> >                 "-M", String.valueOf(j48MinObjects)
> >         };
> >
> > csc.setOptions(adaboost);
> >
> > On Jun 19, 2013, at 5:20 PM, "Chen, Pei" <Pei.Chen@childrens.harvard.edu> wrote:
> >
> > > Also,
> > > Tim recently checked in a Main class that could be the beginnings of a
> > > Driver program.
> > > Check out the main() at:
> > >
> > > http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-clinical-pipeline/src/main/java/org/apache/ctakes/clinicalpipeline/runtime/BagOfCUIsGenerator.java
> > >
> > > --Pei
> > >
> > >
> > >> -----Original Message-----
> > >> From: Girivaraprasad Nambari [mailto:girinambari@gmail.com]
> > >> Sent: Wednesday, June 19, 2013 3:47 PM
> > >> To: dev@ctakes.apache.org
> > >> Subject: Re: docs on running Clinical Document Pipeline from Java?
> > >>
> > >> Hi,
> > >>
> > >> Welcome to ctakes.
> > >>
> > >> There was a similar discussion that I started a few months ago (you
> > >> may be able to find it if you browse through the old discussions).
> > >> Here is the response from Pei Chen & the ctakes community:
> > >>
> > >> It is not quite ready for prime time, but take a peek at the code
> > >> below (it uses uimaFIT to do the above):
> > >>
> > >> http://svn.apache.org/repos/asf/ctakes/sandbox/ctakes-gui/src/main/java/org/chboston/cnlp/ctakes/gui/service/LauncherService.java
> > >>
> > >> Essentially, it boils down to a few lines of code:
> > >>
> > >> AnalysisEngine aggregateAE = AnalysisEngineFactory.createAggregate(
> > >>         engines, componentNames, typeSystemDescription, null,
> > >>         new SofaMapping[0]);
> > >>
> > >> JCas jcas = aggregateAE.newJCas();
> > >> jcas.setDocumentText(doc.getText());
> > >> aggregateAE.process(jcas);
> > >>
> > >>
> > >> We need to start with UIMA and uimaFIT to get a basic understanding;
> > >> after that, using the ctakes components will be easy.
> > >>
> > >> Good luck!
> > >>
> > >> Thank you,
> > >>
> > >> Giri
> > >>
> > >>
> > >> On Wed, Jun 19, 2013 at 3:17 PM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:
> > >>
> > >>> Hi cTAKES folks,
> > >>>
> > >>> I am trying to figure out how to run the Clinical Document
> > >>> Pipeline from Java.  All the documentation I have found so far has
> > >>> been about how to do this through a GUI.  Is there anything on how
> > >>> to run the pipeline programmatically?
> > >>>
> > >>> thanks for any help!
> > >>> Sandy
> > >>>
> >
> >
