ctakes-dev mailing list archives

From Maite Meseure Hugues <meseure.ma...@gmail.com>
Subject Re: Question about the pipeline
Date Tue, 03 Feb 2015 20:28:39 GMT
Oh yes, my apologies: I mixed up RunCPE, which takes the cpe.xml, and
BagofCuisGenerator, which takes input and output directories as arguments.
Thanks for the pointer on CmdLineCpeRunner, I hadn't seen that.

On Tue, Feb 3, 2015 at 1:47 PM, Finan, Sean <
Sean.Finan@childrens.harvard.edu> wrote:

> Hi Maite,
>
> RunCPE is a good find, and if it fits your bill then you should use it.
> But it (if you mean the yTex class) doesn't take input and output
> directories from the command line.  It does take the path to a CPE.xml
> file.  There is a cTakes (non-yTex) equivalent named CmdLineCpeRunner.
> Either one of them should print a usage if you run it without arguments.
> As the CmdLineCpeRunner indicates, you can create a cpe .xml file with the
> cpe gui.  Basically, start the cpe gui, select your input (reader), output
> (writer) and pipeline (ae) in the gui and then save the cpe descriptor (via
> the menubar).  You can exit the gui and run either one of the cmd line
> utilities with the path to that cpe .xml descriptor as the argument.
> Please note: sometimes you have to explicitly type ".xml" in the filename
> when saving with the cpe gui.  If you run with the cpe gui and then exit, it
> should automatically ask you if you want to save the cpe .xml descriptor.
> Anyway, once you have the .xml file you can always edit the input and
> output paths in that file to change your run parameters.
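
For anyone who would rather script that last edit than open the file by hand, here is a minimal sketch using only the JDK's XML classes. It assumes the saved descriptor stores overridden parameters in the usual UIMA nameValuePair form (a name element plus a string value). The class name CpePathEditor and the parameter names in the usage line are made up for illustration, so check your own descriptor for the real names before relying on it.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of cTAKES or UIMA.
public class CpePathEditor {

    /**
     * Sets the string value of every nameValuePair whose name equals paramName.
     * Returns how many pairs were changed.
     * Assumption: pairs look like
     * <nameValuePair><name>X</name><value><string>Y</string></value></nameValuePair>
     */
    public static int setStringParam(Document doc, String paramName, String newValue) {
        int changed = 0;
        NodeList pairs = doc.getElementsByTagName("nameValuePair");
        for (int i = 0; i < pairs.getLength(); i++) {
            Element pair = (Element) pairs.item(i);
            NodeList names = pair.getElementsByTagName("name");
            NodeList strings = pair.getElementsByTagName("string");
            if (names.getLength() == 0 || strings.getLength() == 0) {
                continue; // not a string-valued parameter, leave it alone
            }
            if (paramName.equals(names.item(0).getTextContent().trim())) {
                strings.item(0).setTextContent(newValue);
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: CpePathEditor <cpe.xml> <paramName> <newValue>");
            return;
        }
        File cpeFile = new File(args[0]);
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(cpeFile);
        int changed = setStringParam(doc, args[1], args[2]);
        // Write the modified document back over the original file.
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(cpeFile));
        System.out.println("updated " + changed + " value(s) in " + cpeFile);
    }
}
```

Usage would be something like "java CpePathEditor my_cpe.xml InputDirectory /path/to/notes", where InputDirectory is only a guess at the reader's parameter name. It overwrites the file in place, so keep a backup of the descriptor.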
>
> Sean
>
> -----Original Message-----
> From: Maite Meseure Hugues [mailto:meseure.maite@gmail.com]
> Sent: Tuesday, February 03, 2015 9:01 AM
> To: dev@ctakes.apache.org
> Subject: Re: Question about the pipeline
>
> Thanks a lot Sean for your detailed reply. I've also found RunCPE.java,
> which lets you pass the input and output directories as arguments in the
> environment and does the same job as the CPE GUI, at least in Eclipse. I
> haven't managed to run it via the command line yet.
>
> On Mon, Feb 2, 2015 at 7:12 PM, Finan, Sean <
> Sean.Finan@childrens.harvard.edu> wrote:
>
> > Hi Tol (and Maite),
> >
> > I'm not entirely certain that I understand the question, but here is
> > an attempt to help.  If I'm oversimplifying then I apologize.
> >
> > I think that ExampleAggregatePipeline is intended to represent a very
> > simple single-note pipeline and that custom code could be produced by
> > using it as an example.
> >
> > If you want to process texts in a directory, you can find with a web
> > search plenty of ways to list files in a directory and read text from
> > files.  org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader
> > might be what you used in the CPE, and you can certainly peruse the
> > code and take what you need.  Or, if you decide to write a simple diy,
> > here is one possibility:
> >
> > import java.io.File;
> > import java.io.IOException;
> > import java.nio.file.Files;
> > import java.nio.file.Path;
> > import java.util.ArrayList;
> > import java.util.Collection;
> >
> > public static Collection<File> getFilesInDir( final File directory ) {
> >    // listFiles() returns null if the path is not a readable directory
> >    final File[] files = directory.listFiles();
> >    if ( files == null ) {
> >       System.err.println( "please check the directory "
> >             + directory.getAbsolutePath() );
> >       System.exit( 1 );
> >    }
> >    final Collection<File> fileList = new ArrayList<>();
> >    for ( final File file : files ) {
> >       // skip subdirectories and unreadable files
> >       if ( file.isFile() && file.canRead() ) {
> >          fileList.add( file );
> >       }
> >    }
> >    return fileList;
> > }
> >
> > public static String getTextInFile( final File file ) throws IOException {
> >    // or handle the IOException herein
> >    final Path nioPath = file.toPath();
> >    return new String( Files.readAllBytes( nioPath ) );
> > }
> >
> > public static void main( String ... args ) throws IOException {
> >    if ( args.length == 0 || args[0].isEmpty() ) {
> >       System.out.println( "Enter a directory path" );
> >       System.exit( 0 );
> >    }
> >    final Collection<File> files = getFilesInDir( new File( args[0] ) );
> >    for ( File file : files ) {
> >       final String note = getTextInFile( file );
> >       // ---  Insert here code a la ExampleAggregatePipeline  ---
> >       // ---  swap out the writer in ExampleAggregatePipeline with
> >       // ---  CasIOUtil method (below)  ---
> >    }
> > }
> >
> > I must admit that I have never directly used it, but there is an xmi
> > file writing method in org.apache.uima.fit.util.CasIOUtil named
> > writeXmi( JCas jCas, File file ).  You could give this a try and see
> > if it produces the type of output that you want.  The same utility
> > class has a writeXCas(..) method.
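
To tie the file loop to writeXmi, each note needs its own output file. The following is a small sketch of one way to derive it, assuming you want one .xmi per input note in a separate output directory. The class XmiOutputNaming, the method xmiFileFor and the "<input name>.xmi" convention are all made up for illustration, and the writeXmi call is left as a comment because it needs uimaFIT and a processed JCas.

```java
import java.io.File;

// Hypothetical helper, not part of cTAKES or uimaFIT.
public class XmiOutputNaming {

    /** Maps an input note to a file in outputDir named "<input name>.xmi". */
    public static File xmiFileFor(File inputFile, File outputDir) {
        return new File(outputDir, inputFile.getName() + ".xmi");
    }

    public static void main(String[] args) {
        File out = xmiFileFor(new File("notes/note1.txt"), new File("out"));
        System.out.println(out.getPath());
        // With uimaFIT on the classpath and a processed JCas in hand,
        // the method mentioned above would then be called as:
        //    CasIOUtil.writeXmi( jCas, out );
    }
}
```

Keeping the full input name in the output name (note1.txt.xmi rather than note1.xmi) avoids collisions when two notes differ only by extension.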
> >
> >
> > If the above has absolutely nothing to do with your needs then please
> > send me a bulleted list of items, example workflow, etc. and I'll see
> > if I can be of service.
> >
> > Oh, and I wrote the above code freehand, so MS Outlook is adding
> > capital letters, etc.  If you cut and paste you'll need to change that
> > - plus I haven't run/compiled, so there might be a typo or missed
> > exception or something.  Or it may not work (in which case I'll throw
> > in a little more effort).
> >
> > Sean
> >
> >
> > -----Original Message-----
> > From: Tol O. [mailto:toltox@gmail.com]
> > Sent: Monday, February 02, 2015 6:56 PM
> > To: dev@ctakes.apache.org
> > Subject: Re: Question about the pipeline
> >
> > Maite Meseure Hugues <meseure.maite@...> writes:
> >
> > >
> > > Hello all,
> > >
> > > Thank you for your preceding answers.
> > > I have a few questions regarding the pipeline example to run cTakes
> > > programmatically.
> > > I am running ExampleAggregatePipeline.java with
> > > ExampleHelloWorldAnnotator, but I would like to know how I can change
> > > it to run on my data, as with the CPE, where we can choose the directory
> > > of our data.
> > > My second question is about the xml output generated with the CPE:
> > > can I get the same xml output using the example pipeline, and how?
> > > Thanks for your time.
> >
> >
> > I would like to ask the same question. After successfully setting up
> > CTAKES following the Developers Guide I would also like to use a
> > modified ExampleAggregatePipeline to output a CAS file identical to
> > the output obtained by the CPE or the CVD when following the Users Guide.
> >
> > This would be a great help for developers as a starting class to be
> > able to programmatically obtain an annotated file based on a plaintext
> > or XML input, same as through the two GUIs.
> >
> > Right now I am reading through the Component Use Guide to replicate
> > the CPE or the CVD tutorial with the test input, but it is a bit
> > overwhelming.
> >
> > Any pointers or suggestions would be really appreciated.
> >
> > Tol O.
> >
> >
>
>
> --
> --
>  Maïté Meseure Hugues
>



-- 
--
 Maïté Meseure Hugues
