ctakes-user mailing list archives

From Natalia Connolly <natalia.v.conno...@gmail.com>
Subject Re: Input file format for CPE?
Date Mon, 21 Jul 2014 18:43:05 GMT
Thanks Tim.  This worked in the sense that it did not crash; however, the
output does not seem to have any actual annotations of diagnoses,
medications, etc.  The input text contains a number of such concepts that
had indeed been flagged by CVD; but when I grep for "concept" or "medfacts"
or "cui" in the CPE output there is nothing there.  Would you have any
suggestions for how to "synchronize" the outputs of CVD and CPE?  Both
scripts contain the -Dctakes.umlsuser/umlspw options, so both should have
access to UMLS.

Thank you,
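
The grep check described in this message can be repeated over a whole output directory with a short script. What follows is a minimal sketch, not part of cTAKES: the function names are hypothetical, the search terms are the three strings mentioned above, and matching is case-insensitive (unlike a plain grep), on the assumption that annotation type names in the output may be capitalized.

```python
from pathlib import Path

# Strings searched for in the message above.
TERMS = ("concept", "medfacts", "cui")


def count_terms(text, terms=TERMS):
    """Return a case-insensitive occurrence count for each term in text."""
    lowered = text.lower()
    return {term: lowered.count(term) for term in terms}


def scan_output_dir(output_dir, terms=TERMS):
    """Map each regular file in output_dir to its per-term counts."""
    results = {}
    for path in sorted(Path(output_dir).iterdir()):
        if path.is_file():
            # errors="replace" keeps the scan going on odd encodings.
            text = path.read_text(encoding="utf-8", errors="replace")
            results[path.name] = count_terms(text, terms)
    return results
```

Calling scan_output_dir on the CPE output directory and on a directory of files saved from CVD for the same input would show at a glance whether the two runs differ in these terms. A count of zero for every term in every CPE file would match the symptom reported here.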


On Mon, Jul 21, 2014 at 1:36 PM, Miller, Timothy <
Timothy.Miller@childrens.harvard.edu> wrote:

>  It looks to me like you want test_plaintext.xml rather than test1.xml.
> test1.xml seems to expect CDA-formatted input while test_plaintext.xml can
> read text files like you have.
> Tim
> On 07/21/2014 01:30 PM, Natalia Connolly wrote:
> Hello,
>     I am new to cTAKES.  I am using cTAKES 3.1.  I've been able to run
> the visual debugger without any trouble but now I am stuck on running the
> CPE version, which is what I will really need as I have a large number of
> clinical documents to process.
>      I loaded test1.xml as the descriptor, and made sure both the input
> and the output directories exist.  My single input file in the input
> directory is just plain text, similar to the "Dr. Nutritious" example.
> However, I am getting the following error:
>  org.apache.uima.analysis_engine.AnalysisEngineProcessException
> CausedBy: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 2;
> Content is not allowed in prolog.
>     Does this mean that the input file has to be in xml format?  If so,
> how do I convert plain text into the format that cTAKES expects?
>     Thank you.
>     Natalia Connolly
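
The SAXParseException quoted above is the kind of error an XML parser raises when it is handed plain text. A quick pre-check of the input files can indicate which kind of descriptor is appropriate before a CPE run. The sketch below is an illustration only: the helper name is hypothetical, it uses Python's standard XML parser rather than anything from cTAKES, and well-formed XML is not necessarily valid CDA, so a True result is only a necessary condition for a CDA-oriented descriptor such as test1.xml.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text):
    """Return True if text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # Plain text, empty input, and broken markup all end up here.
        return False
    return True
```

For a plain-text note, as in the question above, the check returns False, which is consistent with the advice to use test_plaintext.xml instead.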
