ctakes-user mailing list archives

From vijay garla <vnga...@gmail.com>
Subject Re: RadLex as dictionary in cTakes
Date Mon, 06 Jan 2014 14:06:23 GMT
Re 1)
I have augmented UMLS-derived dictionaries with custom dictionaries (single
dictionary + single dictionary lookup component).  One disadvantage of
using RadLex in addition to the cTAKES dictionary is that overlapping
concepts will be mapped to both cTAKES and RadLex (duplicate annotations).
Another disadvantage is that word sense disambiguation is not possible
across UMLS and RadLex (it needs the concept relations).
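
To make the duplicate problem concrete, here is a minimal Python sketch. It
assumes each dictionary is a plain mapping from a normalized term to a
concept ID. The function name find_overlaps and the RadLex-style ID are
hypothetical placeholders, not real entries and not part of cTAKES.

```python
def find_overlaps(umls_terms, radlex_terms):
    """Return the terms present in both dictionaries, with the ID from each.

    Every term returned here would be annotated twice if both dictionaries
    were run as separate lookups over the same note.
    """
    shared = umls_terms.keys() & radlex_terms.keys()
    return {term: (umls_terms[term], radlex_terms[term]) for term in sorted(shared)}


if __name__ == "__main__":
    umls_terms = {"hand": "C0018563"}
    # "RID_PLACEHOLDER" is a made-up ID used only for illustration.
    radlex_terms = {"hand": "RID_PLACEHOLDER"}
    print(find_overlaps(umls_terms, radlex_terms))
```

Running a check like this before merging shows how many concepts would need
to be reconciled into one entry.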

Re 2)
I can't speak to the 'preferred' way of importing this, but what I would do
is import RadLex into a DB, and put my own dictionary lookup table together
from the UMLS and RadLex.  The dictionary lookup table - be it in Lucene,
a DB, or CSV - has at a minimum the following columns:
* concept id (e.g. cui)
* tokenized string (string run through ctakes tokenizer, each token
delimited by a space char)
* first word of tokenized string
And optionally:
* semantic type (tui) of the concept
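
As a rough sketch of what one row of such a table could look like, here is
a small Python example that writes the columns listed above to CSV. The
tokenize function is only a stand-in (an assumption) for the cTAKES
tokenizer, and the column names and the helpers make_row and write_rows are
hypothetical. A real table should be built with the same tokenizer the
pipeline uses.

```python
import csv
import io
import re

FIELDS = ["cui", "tokenized", "first_word", "tui"]


def tokenize(text):
    # Stand-in tokenizer (assumption): lowercase, then split into runs of
    # letters/digits and single punctuation characters.
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", text.lower())


def make_row(concept_id, term, tui=None):
    """Build one lookup-table row for a concept and one of its strings."""
    tokens = tokenize(term)
    if not tokens:
        raise ValueError("term has no tokens: %r" % term)
    return {
        "cui": concept_id,
        "tokenized": " ".join(tokens),  # tokens delimited by a space char
        "first_word": tokens[0],
        "tui": tui or "",               # optional semantic type
    }


def write_rows(rows, handle):
    """Write rows as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_rows([make_row("C0018563", "Hand", "T023")], buffer)
    print(buffer.getvalue())
```

The same rows could be loaded into a DB table or a Lucene index instead of
CSV; only the storage changes, not the columns.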

Re 3)
You will have to create a dictionary out of this, as discussed above.

Re 4)
I prefer to have a single dictionary and a single dictionary lookup
component for efficiency and to avoid duplicate annotation.

One issue is that the cTAKES dictionary lookup component is hardwired to
output a specific annotation type (EntityMention, DrugMentionAnnotation,
etc.).  If you don't need the extra annotations added by Drug NER and the
relation extractor (which decorates the AnatomicalSiteMention, I believe),
then go with a single dictionary/single dictionary lookup component that
outputs an EntityMention for each annotation (this is the YTEX default).

With YTEX 0.8/cTAKES 2.5 we created a dictionary lookup component that
figured out which entity type to output (EntityMention vs DrugMention)
based on the CUIs identified.  With cTAKES 3.1 there are more types of
entities (AnatomicalSiteMention, .. and more?).  I have yet to create a
dictionary lookup component that dynamically determines which type of
entity to create based on the CUIs/TUIs contained.  The way I would imagine
doing this is as follows: in the dictionary lookup component, we could have
a map of semantic types to subclasses of EntityMention.  If any of the TUIs
of the matched concepts are in this map, create that subclass of
EntityMention.  For example, we have a single dictionary and a single
dictionary lookup component, which on coming across 'hand' will find the
CUI C0018563, which has the TUI T023 'Body Part, Organ, or Organ
Component'.  T023 is mapped to the AnatomicalSiteMention class, and
therefore the DictionaryLookup would create an AnatomicalSiteMention
annotation.
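
The TUI-to-type map idea can be sketched in a few lines of Python. This is
only an illustration of the proposal, not existing cTAKES or YTEX code: the
annotation types are represented as plain strings, and the names
TUI_TO_TYPE and choose_annotation_type are hypothetical. Only the T023
mapping comes from the example above.

```python
DEFAULT_TYPE = "EntityMention"

# Map of semantic type (TUI) to the annotation type to create.
# Only T023 is taken from the example; further entries would be added here.
TUI_TO_TYPE = {
    "T023": "AnatomicalSiteMention",
}


def choose_annotation_type(tuis, tui_to_type=TUI_TO_TYPE, default=DEFAULT_TYPE):
    """Pick the annotation type for a match from the TUIs of its concepts.

    Assumption: the first TUI found in the map wins. A real component would
    need a rule for matches whose TUIs map to different types.
    """
    for tui in tuis:
        if tui in tui_to_type:
            return tui_to_type[tui]
    return default


if __name__ == "__main__":
    # 'hand' -> C0018563 -> T023
    print(choose_annotation_type(["T023"]))
    # A TUI with no mapping falls back to the default type.
    print(choose_annotation_type(["T047"]))
```

Keeping the map as configuration rather than code would let new entity
types be added without touching the lookup component itself.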

If people think this is a good idea, I'll add a jira ticket for 'Smarter


On Sat, Jan 4, 2014 at 4:03 PM, Vlad Valtchinov

> Hello All-
> best wishes for a good 2014.
> I have a question about using a radiology-related ontology,
> RadLex, in cTakes. If somebody out there has imported it and used it in
> processing radiology reports we’d like to hear from them to share
> their experience.
> A couple of specific questions:
> 1.      what are the pros and cons of implementing RadLex as a custom
> dictionary, or as a part of UMLS-sanctioned ontology
> even though it is not yet part of UMLS but it is part of NCI’s
> Metathesaurus
> 2.      what would be the preferred way to import it in cTakes – via an
> RRF upload, or other custom format
> 3.      could one take a NCImThesaurus download from NCI and use it to
> import RadLex
> 4.      in a recent discussion regarding difference between custom
> dictionaries and UMLS imported ontologies
> “How to augment/modify UMLS resources?” it looks like cTakes would
> actually process the note once for the UMLS
> supplied resources and once for the (each?) custom dictionary, if i
> understand this correctly. Wouldn’t it then be very
> inefficient to have multiple custom dictionaries, as opposed to try to
> maximize the UMLS ones? Any difference in
> the YTex behavior (VJ, please chime in)
> More generally, what are the pros and cons of using all of i.e. the UMLS
> Metathesaurus (or the NCImThesaurus) as
> ontology dictionaries in cTakes, apart from licensing issues?
> Thanks,
> vlad
> Brigham rad
