Date: Mon, 21 Jul 2014 13:29:58 -0400
Subject: Input file format for CPE?
From: Natalia Connolly
To: user@ctakes.apache.org

Hello,

I am new to cTAKES and am using cTAKES 3.1. I've been able to run the visual debugger without any trouble, but now I am stuck on running the CPE version, which is what I really need because I have a large number of clinical documents to process.

I loaded test1.xml as the descriptor and made sure that both the input and the output directories exist. My single input file in the input directory is just plain text, similar to the "Dr. Nutritious" example. However, I am getting the following error:

org.apache.uima.analysis_engine.AnalysisEngineProcessException
CausedBy: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 2; Content is not allowed in prolog.

Does this mean that the input file has to be in XML format? If so, how do I convert plain text into the format that cTAKES expects?

Thank you.

Natalia Connolly