incubator-ctakes-notifications mailing list archives

From "James Joseph Masanz (JIRA)"
Subject [jira] [Created] (CTAKES-145) inconsistent handling of upper ascii
Date Tue, 05 Feb 2013 16:36:12 GMT
James Joseph Masanz created CTAKES-145:

             Summary: inconsistent handling of upper ascii 
                 Key: CTAKES-145
             Project: cTAKES
          Issue Type: Task
          Components: ctakes-preprocessor
    Affects Versions: future enhancement
            Reporter: James Joseph Masanz
            Priority: Minor

Currently cTAKES handles characters above ASCII 127 differently depending on whether the pipeline
processes CDA (Clinical Document Architecture XML) or expects plain text.

The CDA pipelines, as an early step, create a plaintext view in which each upper-ASCII character
is replaced by a blank.

The plaintext pipelines do not do anything special for upper ascii characters.

Example input text for plaintext, to show this behavior: 
His name is Gërman. Temp is 98 °C taken on the forehead
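
To make the difference concrete, here is a minimal Java sketch of the blank-replacement behavior described for the CDA pipelines, applied to the example text. It is not the cTAKES implementation: the class name UpperAsciiDemo and the method blankNonAscii are hypothetical, and the sketch assumes that "upper ascii" means any Java char with a value above 127 and that the accented letter is stored as a single precomposed character.

```java
public class UpperAsciiDemo {

    // Hypothetical helper (not cTAKES code): replace every char above 127
    // with a blank. The output has the same length as the input, so
    // character offsets into the text stay valid.
    public static String blankNonAscii(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            sb.append(c > 127 ? ' ' : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u00EB is e with diaeresis, \u00B0 is the degree sign.
        String input = "His name is G\u00EBrman. Temp is 98 \u00B0C taken on the forehead";

        // What a CDA-style pipeline would see under the assumption above.
        System.out.println(blankNonAscii(input));

        // What a plaintext pipeline would see: the text as given.
        System.out.println(input);
    }
}
```

Under these assumptions the first line printed is "His name is G rman. Temp is 98  C taken on the forehead", so a name and a unit both change shape before tokenization, while the plaintext pipeline passes the original characters downstream.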

We need to decide whether this inconsistent behavior is acceptable or whether we should change
one or the other to make them consistent.


This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators