ctakes-notifications mailing list archives

From "britt fitch (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CTAKES-368) Allow alternate CUI formats in fast dictionary lookup module
Date Thu, 09 Jul 2015 19:24:05 GMT
britt fitch created CTAKES-368:

             Summary: Allow alternate CUI formats in fast dictionary lookup module
                 Key: CTAKES-368
                 URL: https://issues.apache.org/jira/browse/CTAKES-368
             Project: cTAKES
          Issue Type: Improvement
          Components: ctakes-dictionary-lookup
    Affects Versions: 3.2.2
            Reporter: britt fitch
            Assignee: Sean Finan
             Fix For: 3.2.3

The current fast lookup, when reading a BSV (bar-separated values) file, parses the first field
as “C” followed by up to 7 numerals, padding with “0” as needed to reach that length
[see CuiCodeUtil.getCuiCode(String)].

The leading character of the CUI string is then dropped (substring from index 1 to the end) and the remainder is parsed as a Long.

This causes problems with related but separate ontologies such as MedGen, where the bulk
of concepts use UMLS CUIs but some additional concepts were created by the NCBI where no CUI
previously existed.
These MedGen-specific concepts use the prefix “CN” followed by 6 numerals, so a CUI such as
“CN123456” is reduced to “N123456”, which cannot be parsed as a Long.
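
The failure can be reproduced with a minimal sketch. The class and method names below are hypothetical stand-ins that only mimic the behaviour described in this issue; they are not the actual CuiCodeUtil code.

```java
public class CuiParseSketch {

    // Hypothetical stand-in for the parsing step described above:
    // drop the first character and parse the rest as a long.
    public static long parseCui(String cui) {
        return Long.parseLong(cui.substring(1));
    }

    public static void main(String[] args) {
        // A UMLS-style CUI: "C" followed by 7 numerals.
        System.out.println(parseCui("C0001234"));

        // A MedGen-style CUI: "CN" followed by 6 numerals.
        // After dropping the first character, "N123456" is not numeric.
        try {
            parseCui("CN123456");
        } catch (NumberFormatException e) {
            System.out.println("Could not parse CN123456: " + e.getMessage());
        }
    }
}
```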

The fast dictionary lookup module should allow alternative CUI formats.
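
One possible approach, sketched here only as an illustration, is to split each CUI into an alphabetic prefix and a numeric part, and to fold an index for the prefix into the Long. Everything in this sketch is an assumption rather than the fix adopted in cTAKES: the class name, the multiplier, the regular expression, and the rule that a CUI is 8 characters long in total (which matches “C” + 7 numerals and “CN” + 6 numerals).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: encode prefixed CUIs as longs without losing the prefix.
public class PrefixedCuiSketch {

    // Letters followed by 1 to 7 digits, e.g. "C0001234" or "CN123456".
    private static final Pattern CUI = Pattern.compile("([A-Za-z]+)(\\d{1,7})");
    // Leaves room for 7 digits below the prefix index.
    private static final long MULTIPLIER = 10_000_000L;
    // Assumed total CUI length, used only when formatting a code back to text.
    private static final int CUI_LENGTH = 8;

    private final List<String> prefixes = new ArrayList<>();

    public PrefixedCuiSketch() {
        // "C" gets index 0, so plain UMLS CUIs keep their purely numeric value.
        prefixes.add("C");
    }

    public long encode(String cui) {
        Matcher m = CUI.matcher(cui);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized CUI: " + cui);
        }
        String prefix = m.group(1);
        int index = prefixes.indexOf(prefix);
        if (index < 0) {
            // First time this prefix is seen: register it.
            prefixes.add(prefix);
            index = prefixes.size() - 1;
        }
        return index * MULTIPLIER + Long.parseLong(m.group(2));
    }

    public String decode(long code) {
        String prefix = prefixes.get((int) (code / MULTIPLIER));
        int digits = Math.max(1, CUI_LENGTH - prefix.length());
        return prefix + String.format("%0" + digits + "d", code % MULTIPLIER);
    }

    public static void main(String[] args) {
        PrefixedCuiSketch codes = new PrefixedCuiSketch();
        long umls = codes.encode("C0001234");
        long medGen = codes.encode("CN123456");
        System.out.println(umls + " -> " + codes.decode(umls));
        System.out.println(medGen + " -> " + codes.decode(medGen));
    }
}
```

Under these assumptions existing “C” codes are unchanged, while other prefixes map to disjoint numeric ranges, so lookups keyed on Long would not need a different key type.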
