ctakes-dev mailing list archives

From Maite Meseure Hugues <meseure.ma...@gmail.com>
Subject Re: Question about the pipeline
Date Tue, 03 Feb 2015 14:00:05 GMT
Thanks a lot, Sean, for your detailed reply. I've also found RunCPE.java, which
lets you pass the input and output directories as arguments in the
environment and does the same job as the CPE GUI, at least in Eclipse; I
haven't managed to run it from the command line yet.

On Mon, Feb 2, 2015 at 7:12 PM, Finan, Sean <
Sean.Finan@childrens.harvard.edu> wrote:

> Hi Tol (and Maite),
>
> I'm not entirely certain that I understand the question, but here is an
> attempt to help.  If I'm oversimplifying then I apologize.
>
> I think that ExampleAggregatePipeline is intended to represent a very
> simple single-note pipeline and that custom code could be produced by using
> it as an example.
>
> If you want to process texts in a directory, a web search will turn up
> plenty of ways to list files in a directory and read text from files.
> org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader might be
> what you used in the CPE, and you can certainly peruse the code and take
> what you need.  Or, if you decide to write a simple DIY version, here is
> one possibility:
>
> // needs java.io.File, java.util.ArrayList, java.util.Collection
> public static Collection<File> getFilesInDir( final File directory ) {
>    final Collection<File> fileList = new ArrayList<>();
>    // listFiles() returns null if the path is not a directory or cannot be read
>    final File[] files = directory.listFiles();
>    if ( files == null ) {
>       System.err.println( "please check the directory " + directory.getAbsolutePath() );
>       System.exit( 1 );
>    }
>    for ( final File file : files ) {
>       // skip subdirectories, which cannot be read as text
>       if ( file.isFile() && file.canRead() ) {
>          fileList.add( file );
>       }
>    }
>    return fileList;
> }
>
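> // needs java.io.File, java.io.IOException, java.nio.file.Files, java.nio.file.Path
> public static String getTextInFile( final File file ) throws IOException {   // or handle the IOException here
>    final Path nioPath = file.toPath();
>    // decodes with the platform default charset
>    return new String( Files.readAllBytes( nioPath ) );
> }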
>
> public static void main( String... args ) throws IOException {
>    if ( args.length == 0 || args[0].isEmpty() ) {
>       System.out.println( "Enter a directory path" );
>       System.exit( 0 );
>    }
>    final Collection<File> files = getFilesInDir( new File( args[0] ) );
>    for ( File file : files ) {
>       final String note = getTextInFile( file );
>       // ---  Insert here code a la ExampleAggregatePipeline  ---
>       // ---  swap out the writer in ExampleAggregatePipeline with the
>       //      CasIOUtil method (below)  ---
>    }
> }
>
> I must admit that I have never directly used it, but there is an xmi file
> writing method in org.apache.uima.fit.util.CasIOUtil named writeXmi( JCas
> jCas, File file ).  You could give this a try and see if it produces the
> type of output that you want.  The same utility class has a writeXCas(..)
> method.
>
>
> If the above has absolutely nothing to do with your needs then please send
> me a bulleted list of items, example workflow, etc. and I'll see if I can
> be of service.
>
> Oh, and I wrote the above code freehand, so MS Outlook is adding capital
> letters, etc.  If you cut and paste you'll need to change that - plus I
> haven't run/compiled, so there might be a typo or missed exception or
> something.  Or it may not work (in which case I'll throw in a little more
> effort).
>
> Sean
>
>
> -----Original Message-----
> From: Tol O. [mailto:toltox@gmail.com]
> Sent: Monday, February 02, 2015 6:56 PM
> To: dev@ctakes.apache.org
> Subject: Re: Question about the pipeline
>
> Maite Meseure Hugues <meseure.maite@...> writes:
>
> >
> > Hello all,
> >
> > Thank you for your preceding answers.
> > I have a few questions regarding the pipeline example to run cTakes
> > programmatically.
> > I am running ExampleAggregatePipeline.java with
> > ExampleHelloWorldAnnotator, but I would like to know how I can change
> > it to run on my own data, as in the CPE, where we can choose the
> > directory of our data.
> > My second question is about the XML output generated with the CPE: can
> > I get the same XML output using the example pipeline, and how?
> > Thanks for your time.
>
>
> I would like to ask the same question. After successfully setting up
> CTAKES following the Developers Guide I would also like to use a modified
> ExampleAggregatePipeline to output a CAS file identical to the output
> obtained by the CPE or the CVD when following the Users Guide.
>
> This would be a great help for developers as a starting class to be able
> to programmatically obtain an annotated file based on a plaintext or XML
> input, same as through the two GUIs.
>
> Right now I am reading through the Component Use Guide to replicate the
> CPE or the CVD tutorial with the test input, but it is a bit overwhelming.
>
> Any pointers or suggestions would be really appreciated.
>
> Tol O.
>
>
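
For reference, the do-it-yourself directory reading described in the quoted message can also be sketched with java.nio alone. This is only a minimal sketch, not cTAKES code: the class name NoteDirectoryReader and its method names are made up for illustration, UTF-8 is an assumed encoding for the notes, and the pipeline call itself is left as a comment because it depends on how ExampleAggregatePipeline is adapted.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class; none of these names come from cTAKES.
public class NoteDirectoryReader {

    // Returns the readable regular files directly inside the directory, sorted by path.
    // Throws an IOException (e.g. NotDirectoryException) if the path cannot be listed.
    public static List<Path> listNoteFiles(final Path directory) throws IOException {
        final List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(directory)) {
            for (final Path path : stream) {
                // Skip subdirectories and unreadable entries.
                if (Files.isRegularFile(path) && Files.isReadable(path)) {
                    files.add(path);
                }
            }
        }
        Collections.sort(files);
        return files;
    }

    // Reads the whole file as UTF-8 (an assumption; use the charset of your notes).
    public static String readNote(final Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(final String... args) throws IOException {
        if (args.length == 0 || args[0].isEmpty()) {
            System.err.println("Usage: NoteDirectoryReader <input directory>");
            System.exit(1);
        }
        for (final Path file : listNoteFiles(Paths.get(args[0]))) {
            final String note = readNote(file);
            // Pipeline code modeled on ExampleAggregatePipeline would go here.
            System.out.println(file.getFileName() + ": " + note.length() + " characters");
        }
    }
}
```

Passing the input directory as the only program argument prints one line per note; an output directory could be added as a second argument in the same way.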


-- 
--
 Maïté Meseure Hugues
