ctakes-dev mailing list archives

From britt fitch <britt.fi...@wiredinformatics.com>
Subject dictionary-look-fast fails to handle alternative CUIs
Date Wed, 08 Jul 2015 18:22:19 GMT
This is largely directed to Sean but open to other feedback as well.

The current fast lookup using a BSV file parses the first field as “C” followed by up to 7 numerals,
padding with “0” as needed to reach that length [see CuiCodeUtil.getCuiCode(String)].

The leading character is then dropped (substring from index 1 to the end) and the remainder
is parsed as a Long.

This causes problems with related but separate ontologies such as MedGen, where the bulk
of concepts use UMLS CUIs but NCBI created some additional concepts where no CUI
previously existed.
These MedGen-specific concepts use the prefix “CN” + 6 numerals, so once the first character
is dropped the remainder (e.g. “N123456”) cannot be parsed as a Long.
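
For anyone who wants to reproduce the failure, here is a minimal sketch of the behaviour described above. It is not the actual cTAKES source: the class name CuiParseSketch is made up, and its getCuiCode method is only a stand-in that assumes the real method does little more than drop the first character and call Long.parseLong. The zero-padding step is left out because it does not affect the failure.

```java
// Minimal sketch of the parsing described in this message.
// NOT the real CuiCodeUtil; class and method bodies are assumptions.
final class CuiParseSketch {

    // Hypothetical stand-in for CuiCodeUtil.getCuiCode(String):
    // drop the leading character (assumed to be 'C') and parse the rest.
    static long getCuiCode(String cui) {
        return Long.parseLong(cui.substring(1));
    }

    public static void main(String[] args) {
        // Standard UMLS CUI: "C" + 7 numerals parses without trouble.
        System.out.println(getCuiCode("C0123456"));

        // MedGen-specific concept: "CN" + 6 numerals leaves "N123456",
        // which Long.parseLong rejects with a NumberFormatException.
        try {
            getCuiCode("CN123456");
        } catch (NumberFormatException e) {
            System.out.println("Could not parse CN123456: " + e.getMessage());
        }
    }
}
```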

I wanted Sean’s thoughts on this, and feedback on whether others are running into
this issue and whether the community wants a solution that supports CUI formats beyond the standard
C + 7 numerals.

I’m happy to make these edits and check them in, whether that means updating the CuiCodeUtil
class or creating an entirely new BSVConceptFactory, if that’s what makes the most sense.
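
To make the discussion concrete, one possible direction is sketched below. This is purely illustrative and not an agreed design or existing cTAKES code: the class PrefixedCuiSketch, the prefix list, the multiplier, and the fixed total length of 8 characters are all assumptions. The idea is to fold the alphabetic prefix into the Long so that standard “C” CUIs keep the same numeric code they have today.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only; not part of cTAKES.
final class PrefixedCuiSketch {

    // Assumed set of known prefixes. "C" is at index 0 so that standard
    // UMLS CUIs keep the same numeric code as with plain parsing.
    private static final List<String> PREFIXES = Arrays.asList("C", "CN");

    // Larger than any 7-digit number, so the prefix index and the
    // numeric part never overlap inside one long.
    private static final long PREFIX_MULTIPLIER = 100_000_000L;

    // Assumed total CUI length: "C" + 7 numerals or "CN" + 6 numerals.
    private static final int CUI_LENGTH = 8;

    static long toCode(String cui) {
        // Split into leading non-digits (prefix) and trailing digits.
        int firstDigit = 0;
        while (firstDigit < cui.length() && !Character.isDigit(cui.charAt(firstDigit))) {
            firstDigit++;
        }
        int prefixIndex = PREFIXES.indexOf(cui.substring(0, firstDigit));
        if (prefixIndex < 0) {
            throw new IllegalArgumentException("Unrecognized CUI prefix: " + cui);
        }
        long number = Long.parseLong(cui.substring(firstDigit));
        return prefixIndex * PREFIX_MULTIPLIER + number;
    }

    static String fromCode(long code) {
        String prefix = PREFIXES.get((int) (code / PREFIX_MULTIPLIER));
        long number = code % PREFIX_MULTIPLIER;
        // Zero-pad the numeric part so the whole CUI is CUI_LENGTH characters.
        int width = CUI_LENGTH - prefix.length();
        return prefix + String.format("%0" + width + "d", number);
    }

    public static void main(String[] args) {
        for (String cui : new String[] {"C0001234", "CN123456"}) {
            long code = toCode(cui);
            System.out.println(cui + " -> " + code + " -> " + fromCode(code));
        }
    }
}
```

With this layout the existing “C” codes are unchanged, and each additional prefix occupies its own numeric range, at the cost of maintaining a list of known prefixes.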


Britt Fitch
Wired Informatics
265 Franklin St Ste 1702
Boston, MA 02110
