ctakes-notifications mailing list archives

From "Sean Finan (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (CTAKES-389) cTAKES dictionary lookup missed word starting string bug
Date Mon, 16 Nov 2015 17:25:10 GMT

     [ https://issues.apache.org/jira/browse/CTAKES-389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Finan resolved CTAKES-389.
-------------------------------
       Resolution: Fixed
         Assignee: Sean Finan
    Fix Version/s: 3.2.3

Switched from a character-by-character match that ended on token count to a string-by-string
match.  The lookup now depends on Java string-matching speed, but this fixes the problem without
adding more checks to the code.
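
The difference between the two matching styles can be illustrated with a minimal sketch. This is not the cTAKES source, which this message does not show; the class name TokenMatchSketch and both method names are hypothetical, and the buggy variant is only one plausible way a character-by-character comparison can accept a text token that is a prefix of a dictionary token.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from cTAKES.
class TokenMatchSketch {

    // Character-by-character comparison that stops when the text token runs out.
    // A text token that is a prefix of the dictionary token is wrongly accepted.
    static boolean charByCharMatch(List<String> textTokens, List<String> termTokens) {
        if (textTokens.size() != termTokens.size()) {
            return false;
        }
        for (int i = 0; i < termTokens.size(); i++) {
            String text = textTokens.get(i);
            String term = termTokens.get(i);
            if (text.length() > term.length()) {
                return false;
            }
            // Only the characters of the text token are checked; nothing
            // verifies that the dictionary token ends at the same place.
            for (int c = 0; c < text.length(); c++) {
                if (text.charAt(c) != term.charAt(c)) {
                    return false;
                }
            }
        }
        return true;
    }

    // String-by-string comparison: each token must equal its counterpart exactly.
    static boolean stringByStringMatch(List<String> textTokens, List<String> termTokens) {
        if (textTokens.size() != termTokens.size()) {
            return false;
        }
        for (int i = 0; i < termTokens.size(); i++) {
            if (!textTokens.get(i).equals(termTokens.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> term = Arrays.asList("baby", "tooth");
        System.out.println(charByCharMatch(Arrays.asList("baby", "to"), term));
        System.out.println(stringByStringMatch(Arrays.asList("baby", "to"), term));
        System.out.println(stringByStringMatch(Arrays.asList("baby", "tooth"), term));
    }
}
```

In this sketch the first variant accepts "baby to" for "baby tooth" and rejects "baby token", which mirrors the reported behavior, while the second variant accepts only the exact token sequence. The trade-off matches the one named above: the per-token work is delegated to String.equals instead of a hand-written loop.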

> cTAKES dictionary lookup missed word starting string bug
> --------------------------------------------------------
>
>                 Key: CTAKES-389
>                 URL: https://issues.apache.org/jira/browse/CTAKES-389
>             Project: cTAKES
>          Issue Type: Bug
>          Components: ctakes-dictionary-lookup-fast
>    Affects Versions: 3.2.2, 3.2.3
>         Environment: All environments
>            Reporter: Tomasz Oliwa
>            Assignee: Sean Finan
>             Fix For: 3.2.3
>
>
> cTAKES has a bug in its fast dictionary lookup.
> "baby to" and "baby too" are looked up as C1305907 ("baby tooth"); however, "baby token" does not match it.
> "electrolyte le" and "electrolyte lev" are found as C0428284 ("electrolyte level"), but "electrolyte dev" does not match.
> It seems that a match is made whenever the "missed" word consists only of the characters that the word found in the fast dictionary starts with.
> This is a bug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
