ctakes-notifications mailing list archives

From "Kean Kaufmann (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CTAKES-450) CDASegmentAnnotator misses all headings after empty segment
Date Mon, 10 Jul 2017 14:12:00 GMT

    [ https://issues.apache.org/jira/browse/CTAKES-450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080366#comment-16080366 ]

Kean Kaufmann commented on CTAKES-450:
--------------------------------------

Thanks Sean! The RegexSectionizer certainly looks a lot better.  RecordsOne
isn't using the CDASegmentAnnotator anymore (CCDA doesn't seem to have
caught on, does it?) -- I just noticed this while I was setting up unit
tests, and thought I'd put it out there for anybody who might still want to
use it.

> CDASegmentAnnotator misses all headings after empty segment
> -----------------------------------------------------------
>
>                 Key: CTAKES-450
>                 URL: https://issues.apache.org/jira/browse/CTAKES-450
>             Project: cTAKES
>          Issue Type: Bug
>          Components: ctakes-core
>            Reporter: Kean Kaufmann
>         Attachments: CDASegmentAnnotator.diff
>
>
> If the CDASegmentAnnotator encounters an empty segment, it throws away everything after
> that in the document.  You can see this in the test document provided for TestCDASegmentAnnotator.
> The heading "CURRENT HEALTH STATUS" is followed immediately by the heading "Medications";
> the test case misses the "Medications" heading, and "FAMILY HISTORY" after that. The sorted_segments
> loop is only incrementing the index variable for non-empty segments.
> Patch attached.
> TestCDASegmentAnnotator output before fix (with getPreferredText()):
> Segment:2.16.840.1.113883.10.20.22.1.1 Begin:92 End:159: Header
> Segment:1.3.6.1.4.1.19376.1.5.3.1.1.13.2.1 Begin:176 End:1612: CHIEF COMPLAINT
> Segment:2.16.840.1.113883.10.20.22.2.20 Begin:1634 End:1696: HISTORY OF PAST ILLNESS
> Segment:2.16.840.1.113883.10.20.22.2.2.1 Begin:1711 End:2271: History of immunizations
> After fix:
> Segment:2.16.840.1.113883.10.20.22.1.1 Begin:92 End:159: Header
> Segment:1.3.6.1.4.1.19376.1.5.3.1.1.13.2.1 Begin:176 End:1612: CHIEF COMPLAINT
> Segment:2.16.840.1.113883.10.20.22.2.20 Begin:1634 End:1696: HISTORY OF PAST ILLNESS
> Segment:2.16.840.1.113883.10.20.22.2.2.1 Begin:1711 End:2271: History of immunizations
> Segment:2.16.840.1.113883.10.20.22.2.1.1 Begin:2307 End:3506: HISTORY OF MEDICATION USE
> Segment:2.16.840.1.113883.10.20.22.2.15 Begin:3522 End:5608: Family History
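
For anybody trying to picture the failure mode described above, here is a minimal sketch of how an index that advances only for non-empty segments can lose every later heading. It is a hypothetical, simplified model and not the CDASegmentAnnotator source: the Heading type, the method names, the offsets, and the rule "a body runs from the end of one heading to the start of the next" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a heading-driven sectionizer.
// This is NOT the cTAKES code; it only illustrates the indexing mistake.
class SegmentLoopSketch {

    /** Character span of a heading in the document (illustrative type). */
    static final class Heading {
        final String name;
        final int begin;
        final int end;

        Heading(String name, int begin, int end) {
            this.name = name;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Buggy variant: the index advances only when a segment is non-empty. */
    static List<String> segmentsBuggy(List<Heading> sorted, int docLength) {
        List<String> found = new ArrayList<>();
        int index = 0;
        for (Heading h : sorted) {
            int bodyBegin = h.end;
            // The next heading is looked up through the separate index variable.
            int bodyEnd = (index + 1 < sorted.size())
                    ? sorted.get(index + 1).begin
                    : docLength;
            if (bodyEnd > bodyBegin) {
                found.add(h.name);
                index++; // never reached for an empty segment, so index goes stale
            }
        }
        return found;
    }

    /** Fixed variant: the index advances on every iteration. */
    static List<String> segmentsFixed(List<Heading> sorted, int docLength) {
        List<String> found = new ArrayList<>();
        int index = 0;
        for (Heading h : sorted) {
            int bodyBegin = h.end;
            int bodyEnd = (index + 1 < sorted.size())
                    ? sorted.get(index + 1).begin
                    : docLength;
            if (bodyEnd > bodyBegin) {
                found.add(h.name);
            }
            index++; // keep the index in step with the loop, empty segment or not
        }
        return found;
    }
}
```

With made-up offsets in which "CURRENT HEALTH STATUS" is immediately followed by "Medications", the buggy variant in this model stops reporting headings at the empty segment, while the fixed variant skips only the empty segment and still reports "Medications" and "FAMILY HISTORY".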



