ctakes-dev mailing list archives

From "Finan, Sean" <Sean.Fi...@childrens.harvard.edu>
Subject RE: Question about the pipeline
Date Thu, 05 Feb 2015 20:24:51 GMT
Hi Maite,

If you can run the CPE GUI using the script in bin/, try passing the descriptor to that script:

runctakesCPE -desc pathToXml

If that runs, try copying runctakesCPE to something like runctakesCLI and changing the
last line of the copy to call CmdLineCpeRunner instead of CpmFrame.

Sean

P.S. Check the last line of the runctakesCPE script that you are using and make sure that it
passes its arguments along: %* for Windows or $@ for Unix/Linux.

-----Original Message-----
From: Maite Meseure Hugues [mailto:meseure.maite@gmail.com] 
Sent: Thursday, February 05, 2015 9:42 AM
To: dev@ctakes.apache.org
Subject: Re: Question about the pipeline

Yes, it does, but only in Eclipse, not from the command line, even though I am in the right directory.
I probably have to look at the classpath in more detail.
Thanks for your replies.

On Thu, Feb 5, 2015 at 8:08 AM, Finan, Sean < Sean.Finan@childrens.harvard.edu> wrote:

> Hi Maite,
>
> Without more information I can't venture a guess as to a cause of the 
> error.  If RunCPE works then why not use that?  They are practically 
> identical.
>
> Sean
> ________________________________________
> From: Maite Meseure Hugues [meseure.maite@gmail.com]
> Sent: Thursday, February 05, 2015 8:51 AM
> To: dev@ctakes.apache.org
> Subject: Re: Question about the pipeline
>
> I see. In my case, I am using the CPE descriptor saved from the GUI
> for CmdLineCpeRunner, as Sean said. I've selected
> AggregatePlaintextProcessor.xml as the AE, but I get this error:
>
> "Couldn't initialize processing engine.
>
>   Initialization of CAS Processor with name "AggregatePlaintextProcessor"
> failed. "
>
> Meanwhile, RunCPE.java works properly with the same descriptor in Eclipse.
> Does anyone have an idea?
>
> On Wed, Feb 4, 2015 at 12:56 PM, Lingren, Todd 
> <Todd.Lingren@cchmc.org>
> wrote:
>
> > Hi Maite,
> > For each patient in my list, I create a new FilesToFiles CPE xml 
> > using some sed commands on the template original.
> >
> > Specifically, here's the command line argument (I'm on linux).
> >
> > CTAKES_HOME=...
> > java -cp $CTAKES_HOME/lib/*:$CTAKES_HOME/desc/:$CTAKES_HOME/resources/ \
> >   -Dlog4j.configuration=file:$CTAKES_HOME/config/log4j.xml -Xms512M -Xmx2048M \
> >   CmdLineCpeRunner FilesToFiles_patient_cui.xml > outputfile.txt
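
For anyone launching the same run from Java rather than from a shell, the command above could be assembled roughly as follows. This is only a sketch under assumptions: the class CpeCommandBuilder and the /opt/ctakes location are made up for illustration, and the fully qualified runner name is the one quoted elsewhere in this thread. The one practical point it shows is that the classpath separator differs by platform (':' on Linux, ';' on Windows), which File.pathSeparator takes care of.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of cTAKES: builds the java command line
// used in this thread for a given cTAKES home and CPE descriptor.
class CpeCommandBuilder {

    static List<String> buildCommand(String ctakesHome, String cpeXml) {
        // ':' on Unix/Linux, ';' on Windows
        String sep = File.pathSeparator;
        String classpath = ctakesHome + "/lib/*" + sep
                + ctakesHome + "/desc/" + sep
                + ctakesHome + "/resources/";

        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-cp");
        command.add(classpath);
        command.add("-Dlog4j.configuration=file:" + ctakesHome + "/config/log4j.xml");
        command.add("-Xms512M");
        command.add("-Xmx2048M");
        // Fully qualified name as quoted elsewhere in this thread.
        command.add("org.apache.ctakes.core.cpe.CmdLineCpeRunner");
        command.add(cpeXml);
        return command;
    }

    public static void main(String[] args) {
        // "/opt/ctakes" is a placeholder; point it at your own installation.
        List<String> command = buildCommand("/opt/ctakes", "FilesToFiles_patient_cui.xml");
        System.out.println(String.join(" ", command));
        // To actually launch it, something like:
        //   new ProcessBuilder(command).inheritIO().start().waitFor();
    }
}
```

Passing the classpath as a single argument this way also avoids any shell expansion of the lib/* wildcard, which the java launcher expands itself.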
> >
> > I don't think it matters, but I'm using the cTAKES 3.1.0 version.
> >
> >
> > Todd Lingren
> > Biomedical Informatics
> > Cincinnati Children’s Hospital
> > Todd.Lingren@cchmc.org
> > 513-803-9032
> >
> >
> > -----Original Message-----
> > From: Maite Meseure Hugues [mailto:meseure.maite@gmail.com]
> > Sent: Wednesday, February 04, 2015 12:59 PM
> > To: dev@ctakes.apache.org
> > Subject: Re: Question about the pipeline
> >
> > Interesting, thank you Todd. How do you use CmdLineCpeRunner, basically?
> > I tested it on the command line with:
> >
> > java org.apache.ctakes.core.cpe.CmdLineCpeRunner [path-to-my-cpe.xml]
> >
> > but here is what I got:
> >
> >
> > Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/uima/util/InvalidXMLException
> >     at java.lang.Class.getDeclaredMethods0(Native Method)
> >     at java.lang.Class.privateGetDeclaredMethods(Class.java:2693)
> >     at java.lang.Class.privateGetMethodRecursive(Class.java:3040)
> >     at java.lang.Class.getMethod0(Class.java:3010)
> >     at java.lang.Class.getMethod(Class.java:1776)
> >     at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
> >     at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
> >
> > .......
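
A NoClassDefFoundError like this one usually means that a jar (here, most likely the UIMA core jar) is not on the classpath handed to java; note that the command above has no -cp option at all. One quick way to check is a throwaway class run with exactly the classpath you intend to use. The class name ClasspathCheck is invented for this sketch; the two default class names are the ones that appear in this thread.

```java
// Hypothetical diagnostic, not part of cTAKES: reports whether given
// classes can be loaded with the classpath this program was started with.
class ClasspathCheck {

    static boolean isOnClasspath(String className) {
        try {
            // 'false' = load the class without running its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError from missing dependencies.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.uima.util.InvalidXMLException",
            "org.apache.ctakes.core.cpe.CmdLineCpeRunner"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

If both classes are reported as found with a given -cp value, the same value should get CmdLineCpeRunner past this particular error.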
> >
> > On Wed, Feb 4, 2015 at 8:32 AM, Lingren, Todd 
> > <Todd.Lingren@cchmc.org>
> > wrote:
> >
> > > Sean and Maite,
> > > FWIW, I use CmdLineCpeRunner frequently. I employ it with a bash 
> > > script to automatically create a new xml file based on the 
> > > subfolder names contained in the target directory. So in our HPC, 
> > > it spawns a new job for each subfolder (which may have between 5 and 2500 notes).
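
The per-subfolder descriptor generation described above can also be sketched in Java instead of bash and sed. Everything specific here is an assumption for illustration: the placeholder tokens @INPUT_DIR@ and @OUTPUT_DIR@, the class name CpeTemplateFiller, and the FilesToFiles_<name>.xml naming are hypothetical, and the template is assumed to be a CPE descriptor in which the input and output paths were replaced by those tokens by hand.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of cTAKES.
class CpeTemplateFiller {

    // Replace the (assumed) placeholder tokens in the template text.
    // Paths are inserted verbatim, so they must not contain XML special characters.
    static String fill(String template, String inputDir, String outputDir) {
        return template.replace("@INPUT_DIR@", inputDir)
                       .replace("@OUTPUT_DIR@", outputDir);
    }

    // Write one descriptor per subfolder of notesRoot and return the files written.
    static List<Path> writeDescriptors(Path templateFile, Path notesRoot,
                                       Path outputRoot, Path descriptorDir) throws IOException {
        String template = new String(Files.readAllBytes(templateFile), StandardCharsets.UTF_8);
        Files.createDirectories(descriptorDir);
        List<Path> subfolders;
        try (Stream<Path> children = Files.list(notesRoot)) {
            subfolders = children.filter(Files::isDirectory).sorted().collect(Collectors.toList());
        }
        List<Path> written = new ArrayList<>();
        for (Path sub : subfolders) {
            String name = sub.getFileName().toString();
            String xml = fill(template,
                    sub.toAbsolutePath().toString(),
                    outputRoot.resolve(name).toAbsolutePath().toString());
            Path descriptor = descriptorDir.resolve("FilesToFiles_" + name + ".xml");
            Files.write(descriptor, xml.getBytes(StandardCharsets.UTF_8));
            written.add(descriptor);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            System.err.println("usage: CpeTemplateFiller <template.xml> <notesRoot> <outputRoot> <descriptorDir>");
            return;
        }
        for (Path p : writeDescriptors(Paths.get(args[0]), Paths.get(args[1]),
                                       Paths.get(args[2]), Paths.get(args[3]))) {
            System.out.println(p);
        }
    }
}
```

Each printed descriptor could then be handed to CmdLineCpeRunner as its own job.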
> > >
> > > Todd Lingren
> > > Biomedical Informatics
> > > Cincinnati Children’s Hospital
> > > Todd.Lingren@cchmc.org
> > > 513-803-9032
> > >
> > >
> > > -----Original Message-----
> > > From: Finan, Sean [mailto:Sean.Finan@childrens.harvard.edu]
> > > Sent: Tuesday, February 03, 2015 2:47 PM
> > > To: dev@ctakes.apache.org
> > > Subject: RE: Question about the pipeline
> > >
> > > Hi Maite,
> > >
> > > RunCPE is a good find, and if it fits your bill then you should use it.
> > > But it (if you mean the yTex class) doesn't take input and output
> > > directories from the command line.  It does take the path to a
> > > CPE.xml file.  There is a cTakes (non-yTex) equivalent named CmdLineCpeRunner.
> > > Either one of them should print a usage message if you run it without
> > > arguments.
> > > As the CmdLineCpeRunner usage indicates, you can create a cpe .xml file
> > > with the cpe gui.  Basically, start the cpe gui, select your input
> > > (reader), output (writer) and pipeline (ae) in the gui and then save
> > > the cpe descriptor (via the menubar).  You can exit the gui and run
> > > either one of the cmd line utilities with the path to that cpe .xml
> > > descriptor as the argument.
> > > Please note: sometimes you have to explicitly type ".xml" in the
> > > filename when saving with the cpe gui.  If you run with the cpe
> > > gui and then exit, it should automatically ask you if you want to
> > > save the cpe .xml descriptor.
> > > Anyway, once you have the .xml file you can always edit the input
> > > and output paths in that file to change your run parameters.
> > >
> > > Sean
> > >
> > > -----Original Message-----
> > > From: Maite Meseure Hugues [mailto:meseure.maite@gmail.com]
> > > Sent: Tuesday, February 03, 2015 9:01 AM
> > > To: dev@ctakes.apache.org
> > > Subject: Re: Question about the pipeline
> > >
> > > Thanks a lot, Sean, for your detailed reply. I've also found
> > > RunCPE.java, which lets me set the input and output directories as
> > > arguments in the environment and does the same job as the CPE GUI,
> > > at least in Eclipse; I haven't managed to run it via the command line yet.
> > >
> > > On Mon, Feb 2, 2015 at 7:12 PM, Finan, Sean < 
> > > Sean.Finan@childrens.harvard.edu> wrote:
> > >
> > > > Hi Tol (and Maite),
> > > >
> > > > I'm not entirely certain that I understand the question, but 
> > > > here is an attempt to help.  If I'm oversimplifying then I apologize.
> > > >
> > > > I think that ExampleAggregatePipeline is intended to represent a 
> > > > very simple single-note pipeline and that custom code could be 
> > > > produced by using it as an example.
> > > >
> > > > If you want to process texts in a directory, you can find with a 
> > > > web search plenty of ways to list files in a directory and read 
> > > > text from files.
> > > > org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader
> > > > might be what you used in the CPE, and you can certainly peruse 
> > > > the code and take what you need.  Or, if you decide to write a 
> > > > simple diy, here is one
> > > > possibility:
> > > >
> > > > import java.io.File;
> > > > import java.io.IOException;
> > > > import java.nio.file.Files;
> > > > import java.nio.file.Path;
> > > > import java.util.ArrayList;
> > > > import java.util.Collection;
> > > >
> > > > // The methods below belong inside a class of your own.
> > > >
> > > > // Lists the readable files directly inside a directory.
> > > > static public Collection<File> getFilesInDir( final File directory ) {
> > > >    final Collection<File> fileList = new ArrayList<>();
> > > >    final File[] files = directory.listFiles();
> > > >    // listFiles() returns null if the path is not a readable directory
> > > >    if ( files == null ) {
> > > >       System.err.println( "please check the directory " + directory.getAbsolutePath() );
> > > >       System.exit( 1 );
> > > >    }
> > > >    for ( final File file : files ) {
> > > >       // skip subdirectories; only plain, readable files are notes
> > > >       if ( file.isFile() && file.canRead() ) {
> > > >          fileList.add( file );
> > > >       }
> > > >    }
> > > >    return fileList;
> > > > }
> > > >
> > > > // Reads the whole file as text -- or handle the IOException herein
> > > > static public String getTextInFile( final File file ) throws IOException {
> > > >    final Path nioPath = file.toPath();
> > > >    return new String( Files.readAllBytes( nioPath ) );
> > > > }
> > > >
> > > > static public void main( String ... args ) throws IOException {
> > > >    if ( args.length == 0 || args[0].isEmpty() ) {
> > > >       System.out.println( "Enter a directory path" );
> > > >       System.exit( 0 );
> > > >    }
> > > >    final Collection<File> files = getFilesInDir( new File( args[0] ) );
> > > >    for ( File file : files ) {
> > > >       final String note = getTextInFile( file );
> > > >       // ---  Insert here code a' la ExampleAggregatePipeline  ---
> > > >       // ---  swap out the writer in ExampleAggregatePipeline with
> > > >       //      CasIOUtil method (below)  ---
> > > >    }
> > > > }
> > > >
> > > > I must admit that I have never directly used it, but there is an 
> > > > xmi file writing method in org.apache.uima.fit.util.CasIOUtil 
> > > > named writeXmi( JCas jCas, File file ).  You could give this a 
> > > > try and see if it produces the type of output that you want.  
> > > > The same utility class has a writeXCas(..) method.
> > > >
> > > >
> > > > If the above has absolutely nothing to do with your needs then 
> > > > please send me a bulleted list of items, example workflow, etc. 
> > > > and I'll see if I can be of service.
> > > >
> > > > Oh, and I wrote the above code freehand, so MS Outlook is adding 
> > > > capital letters, etc.  If you cut and paste you'll need to 
> > > > change that
> > > > - plus I haven't run/compiled, so there might be a typo or 
> > > > missed exception or something.  Or it may not work (in which 
> > > > case I'll throw in a little more effort).
> > > >
> > > > Sean
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Tol O. [mailto:toltox@gmail.com]
> > > > Sent: Monday, February 02, 2015 6:56 PM
> > > > To: dev@ctakes.apache.org
> > > > Subject: Re: Question about the pipeline
> > > >
> > > > Maite Meseure Hugues <meseure.maite@...> writes:
> > > >
> > > > >
> > > > > Hello all,
> > > > >
> > > > > Thank you for your preceding answers.
> > > > > I have a few questions regarding the pipeline example to run
> > > > > cTakes programmatically.
> > > > > I am running ExampleAggregatePipeline.java with
> > > > > ExampleHelloWorldAnnotator, but I would like to know how I can
> > > > > change it to run on my own data, as with the CPE, where we can
> > > > > choose the directory of our data.
> > > > > My second question is about the xml output generated with the
> > > > > CPE: can I get the same xml output using the example
> > > > > pipeline? And how?
> > > > > Thanks for your time.
> > > >
> > > >
> > > > I would like to ask the same question. After successfully
> > > > setting up CTAKES following the Developers Guide I would also
> > > > like to use a modified ExampleAggregatePipeline to output a CAS
> > > > file identical to the output obtained by the CPE or the CVD when
> > > > following the Users Guide.
> > > >
> > > > This would be a great help for developers as a starting class to 
> > > > be able to programmatically obtain an annotated file based on a 
> > > > plaintext or XML input, same as through the two GUIs.
> > > >
> > > > Right now I am reading through the Component Use Guide to
> > > > replicate the CPE or the CVD tutorial with the test input, but
> > > > it is a bit overwhelming.
> > > >
> > > > Any pointers or suggestions would be really appreciated.
> > > >
> > > > Tol O.
> > > >
> > > >
> > >
> > >
> > > --
> > > --
> > >  Maïté Meseure Hugues
> > >
> >
> >
> >
> > --
> > --
> >  Maïté Meseure Hugues
> >
>
>
>
> --
> --
>  Maïté Meseure Hugues
>



--
--
 Maïté Meseure Hugues