ctakes-dev mailing list archives

From "Finan, Sean" <Sean.Fi...@childrens.harvard.edu>
Subject RE: Question about the pipeline
Date Tue, 03 Feb 2015 19:47:00 GMT
Hi Maite,

RunCPE is a good find, and if it fits your bill then you should use it.  But it (if you mean
the yTex class) doesn't take input and output directories from the command line.  It does
take the path to a CPE.xml file.  There is a cTakes (non-yTex) equivalent named CmdLineCpeRunner.
 Either one of them should print a usage if you run it without arguments.  As the CmdLineCpeRunner
indicates, you can create a cpe .xml file with the cpe gui.  Basically, start the cpe gui,
select your input (reader), output (writer) and pipeline (ae) in the gui and then save the
cpe descriptor (via the menubar).  You can exit the gui and run either one of the cmd line
utilities with the path to that cpe .xml descriptor as the argument.  Please note: sometimes
you have to explicitly type ".xml" in the filename when saving with the cpe gui.  If you run
with the cpe gui and then exit it should automatically ask you if you want to save the cpe
.xml descriptor.  Anyway, once you have the .xml file you can always edit the input and output
paths in that file to change your run parameters.  
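
If you would rather change those paths from code than by hand, the sketch below shows one
way to do it with the JDK's XML classes.  It is a minimal sketch and not part of cTakes: the
class CpePathEditor and its methods are hypothetical names, and it assumes that the descriptor
stores each setting as a nameValuePair element holding a name and a string value, and that
the reader and writer parameters are called InputDirectory and OutputDirectory.  Check the
names in your own saved descriptor before relying on it.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not a cTAKES or UIMA class.
final class CpePathEditor {

    // Sets the <string> value of every nameValuePair whose <name> matches paramName.
    // Returns the number of pairs that were changed.
    static int setStringParam(Document doc, String paramName, String newValue) {
        int changed = 0;
        NodeList pairs = doc.getElementsByTagName("nameValuePair");
        for (int i = 0; i < pairs.getLength(); i++) {
            Element pair = (Element) pairs.item(i);
            NodeList names = pair.getElementsByTagName("name");
            NodeList strings = pair.getElementsByTagName("string");
            if (names.getLength() > 0 && strings.getLength() > 0
                    && paramName.equals(names.item(0).getTextContent().trim())) {
                strings.item(0).setTextContent(newValue);
                changed++;
            }
        }
        return changed;
    }

    // Parses the descriptor text, updates one parameter and returns the new XML text.
    static String rewrite(String cpeXml, String paramName, String newValue) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(cpeXml)));
            setStringParam(doc, paramName, newValue);
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not rewrite CPE descriptor", e);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("usage: CpePathEditor <cpe.xml> <inputDir> <outputDir>");
            return;
        }
        Path descriptor = Paths.get(args[0]);
        String xml = new String(Files.readAllBytes(descriptor), StandardCharsets.UTF_8);
        // Parameter names are assumptions; use the ones found in your descriptor.
        xml = rewrite(xml, "InputDirectory", args[1]);
        xml = rewrite(xml, "OutputDirectory", args[2]);
        Files.write(descriptor, xml.getBytes(StandardCharsets.UTF_8));
    }
}
```

The file is overwritten in place, so keep a copy of the descriptor that the cpe gui saved.
After the paths are updated, the descriptor can be passed to either command line utility as before.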

Sean

-----Original Message-----
From: Maite Meseure Hugues [mailto:meseure.maite@gmail.com] 
Sent: Tuesday, February 03, 2015 9:01 AM
To: dev@ctakes.apache.org
Subject: Re: Question about the pipeline

Thanks a lot Sean for your detailed reply. I've also found RunCPE.java, which lets you set
the input and output directories as arguments in the environment and does the same job as
the CPE GUI, at least in Eclipse; I haven't managed to run it from the command line yet.

On Mon, Feb 2, 2015 at 7:12 PM, Finan, Sean < Sean.Finan@childrens.harvard.edu> wrote:

> Hi Tol (and Maite),
>
> I'm not entirely certain that I understand the question, but here is 
> an attempt to help.  If I'm oversimplifying then I apologize.
>
> I think that ExampleAggregatePipeline is intended to represent a very 
> simple single-note pipeline and that custom code could be produced by 
> using it as an example.
>
> If you want to process texts in a directory, you can find with a web 
> search plenty of ways to list files in a directory and read text from 
> files.  org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader 
> might be what you used in the CPE, and you can certainly peruse the 
> code and take what you need.  Or, if you decide to write a simple diy,
> here is one possibility:
>
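> // Imports needed by the methods below, which belong inside a class of your choice.
> import java.io.File;
> import java.io.IOException;
> import java.nio.file.Files;
> import java.nio.file.Path;
> import java.util.ArrayList;
> import java.util.Collection;
>
> // Collect the readable files in the given directory.
> static public Collection<File> getFilesInDir( final File directory ) {
>    final Collection<File> fileList = new ArrayList<>();
>    final File[] files = directory.listFiles();
>    // listFiles() returns null if the path is not a directory or cannot be read
>    if ( files == null ) {
>       System.err.println( "please check the directory " + directory.getAbsolutePath() );
>       System.exit( 1 );
>    }
>    for ( final File file : files ) {
>       // skip subdirectories and unreadable files
>       if ( file.isFile() && file.canRead() ) {
>          fileList.add( file );
>       }
>    }
>    return fileList;
> }
>
> // Read the whole file as text using the platform default charset.
> static public String getTextInFile( final File file ) throws IOException {
>    // -- or handle the IOException herein
>    final Path nioPath = file.toPath();
>    return new String( Files.readAllBytes( nioPath ) );
> }
>
> static public void main( String ... args ) throws IOException {
>    // guard against a missing argument as well as an empty one
>    if ( args.length == 0 || args[0].isEmpty() ) {
>       System.out.println( "Enter a directory path" );
>       System.exit( 0 );
>    }
>    final Collection<File> files = getFilesInDir( new File( args[0] ) );
>    for ( File file : files ) {
>       final String note = getTextInFile( file );
>       // ---  Insert here code a' la ExampleAggregatePipeline  ---
>       // ---  swap out the writer in ExampleAggregatePipeline with
>       //      CasIOUtil method (below)  ---
>    }
> }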
>
> I must admit that I have never directly used it, but there is an xmi 
> file writing method in org.apache.uima.fit.util.CasIOUtil named 
> writeXmi( JCas jCas, File file ).  You could give this a try and see 
> if it produces the type of output that you want.  The same utility 
> class has a writeXCas(..) method.
>
>
> If the above has absolutely nothing to do with your needs then please 
> send me a bulleted list of items, example workflow, etc. and I'll see 
> if I can be of service.
>
> Oh, and I wrote the above code freehand, so MS Outlook is adding 
> capital letters, etc.  If you cut and paste you'll need to change that 
> - plus I haven't run/compiled, so there might be a typo or missed 
> exception or something.  Or it may not work (in which case I'll throw 
> in a little more effort).
>
> Sean
>
>
> -----Original Message-----
> From: Tol O. [mailto:toltox@gmail.com]
> Sent: Monday, February 02, 2015 6:56 PM
> To: dev@ctakes.apache.org
> Subject: Re: Question about the pipeline
>
> Maite Meseure Hugues <meseure.maite@...> writes:
>
> >
> > Hello all,
> >
> > Thank you for your preceding answers.
> > I have a few questions regarding the pipeline example to run cTakes 
> > programmatically.
> > I am running ExampleAggregatePipeline.java with
> > ExampleHelloWorldAnnotator, but I would like to know how I can change
> > it to run on my own data, as in the CPE, where we can choose the
> > directory of our data.
> > My second question is about the xml output generated with the CPE:
> > can I get the same xml output using the example pipeline, and how?
> > Thanks for your time.
>
>
> I would like to ask the same question. After successfully setting up 
> CTAKES following the Developers Guide I would also like to use a 
> modified ExampleAggregatePipeline to output a CAS file identical to 
> the output obtained by the CPE or the CVD when following the Users Guide.
>
> This would be a great help for developers as a starting class to be 
> able to programmatically obtain an annotated file based on a plaintext 
> or XML input, same as through the two GUIs.
>
> Right now I am reading through the Component Use Guide to replicate 
> the CPE or the CVD tutorial with the test input, but it is a bit overwhelming.
>
> Any pointers or suggestions would be really appreciated.
>
> Tol O.
>
>


--
 Maïté Meseure Hugues