ctakes-user mailing list archives

From vijay garla <vnga...@gmail.com>
Subject Re: RadLex as dictionary in cTakes
Date Thu, 06 Feb 2014 02:35:15 GMT
Hi Vlad,

Assuming you have UMLS installed in your DB with RXNORM and RADLEX (I
didn't know radlex was included in the umls), and have run the YTEX
install, you can very easily create new dictionary lookup tables.

To do so:
* delete the contents of v_snomed_fword_lookup (delete
from v_snomed_fword_lookup)
* modify this script to include whatever SABs (source vocabularies) you
like:
https://svn.apache.org/repos/asf/ctakes/branches/ytex/ctakes-ytex/scripts/data/mssql/umls/insert_view.sql
You will see the following line:
and mrc.sab in ( 'SNOMEDCT','RXNORM' )
Change this to include whatever source vocabularies you want, and re-run
the script.
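
As a rough illustration of that edit, here is a minimal Python sketch that
rewrites the SAB filter in the script text. It assumes the filter appears in
the form quoted above; set_sabs is a hypothetical helper (not part of YTEX or
cTAKES), and 'RADLEX' as the SAB name is an assumption to check against your
own UMLS install.

```python
import re

# Matches the filter line quoted above, e.g.
#   and mrc.sab in ( 'SNOMEDCT','RXNORM' )
SAB_FILTER = re.compile(r"(mrc\.sab\s+in\s*\()[^)]*(\))", re.IGNORECASE)

# Conservative check so that only plain vocabulary names end up in the SQL.
SAB_NAME = re.compile(r"^[A-Za-z0-9_.-]+$")


def set_sabs(sql_text, sabs):
    """Return sql_text with the mrc.sab IN (...) list replaced by sabs.

    Hypothetical helper for illustration only.
    """
    for sab in sabs:
        if not SAB_NAME.match(sab):
            raise ValueError("unexpected SAB name: %r" % (sab,))
    quoted = ",".join("'%s'" % sab for sab in sabs)
    new_text, count = SAB_FILTER.subn(
        lambda m: m.group(1) + " " + quoted + " " + m.group(2), sql_text
    )
    if count == 0:
        raise ValueError("no 'mrc.sab in (...)' filter found in the script")
    return new_text


line = "and mrc.sab in ( 'SNOMEDCT','RXNORM' )"
print(set_sabs(line, ["SNOMEDCT", "RXNORM", "RADLEX"]))
```

Editing the line by hand works just as well; the helper only makes the
change repeatable if you regenerate the lookup table often.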

-vj


On Wed, Feb 5, 2014 at 5:05 PM, <vlad.valtchinov@gmail.com> wrote:

> A follow-up question regarding custom UMLS dictionaries in cTakes.
>
> Thanks to all the people who took the time to post, below.
>
> Additionally, I wanted to request opinion on 2-3 things.
>
> 1. how does one prepare the current Snomed + Rxnorm UMLS ontologies for
> import in cTakes, as a database.
> 2. in these steps, does one need only the RRF files (in META dir), or also
> the files from the Semantic Network for the given UMLS subset, stored in
> the NET directory by MMorpho.
> 3. what are the steps one needs to follow to make cTakes 3.1.1 use
> a subset of UMLS pre-stored in the db.
>
> Any and all leads are highly appreciated.
>
> Best,
> vlad
>
>
> On Monday, January 6, 2014 9:06:23 AM UTC-5, vijay garla wrote:
>
>> Re 1)
>> I have augmented umls-derived dictionaries with custom dictionaries
>> (single dictionary + single dictionary lookup component).  One disadvantage
>> of using radlex in addition to ctakes is that overlapping concepts will be
>> mapped to both ctakes & radlex (duplicates).  Another disadvantage is that
>> Word Sense Disambiguation is not possible across UMLS & RADLEX (need the
>> concept relations).
>>
>> Re 2)
>> I can't speak to the 'preferred' way of importing this, but what I would
>> do is import radlex into a DB, and put my own dictionary lookup table
>> together from the UMLS and RADLEX.  The dictionary lookup table - be it in
>> lucene, db, or csv - has at a minimum the following columns:
>> * concept id (e.g. cui)
>> * tokenized string (string run through ctakes tokenizer, each token
>> delimited by a space char)
>> * first word of tokenized string
>> And optional
>> * semantic type (tui) of the concept
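
The columns listed above could be produced roughly as follows. This is only a
sketch: LookupRow and make_row are hypothetical names, and plain whitespace
splitting stands in for the cTAKES tokenizer, which a real build would need
to use so that dictionary entries match the tokens seen at lookup time.

```python
from typing import Callable, List, NamedTuple, Optional


class LookupRow(NamedTuple):
    """One dictionary lookup entry with the columns described above."""
    cui: str                    # concept id
    tokenized: str              # tokens joined by a single space
    first_word: str             # first token of the tokenized string
    tui: Optional[str] = None   # optional semantic type


def make_row(
    cui: str,
    term: str,
    tui: Optional[str] = None,
    tokenize: Callable[[str], List[str]] = str.split,
) -> LookupRow:
    """Build one lookup row from a concept id and a term string.

    `tokenize` defaults to whitespace splitting, which is an assumption
    standing in for the cTAKES tokenizer.
    """
    tokens = tokenize(term)
    if not tokens:
        raise ValueError("term has no tokens: %r" % (term,))
    return LookupRow(cui, " ".join(tokens), tokens[0], tui)


# C0018563 / T023 are the 'hand' example used later in this thread.
print(make_row("C0018563", "hand", "T023"))
```

The same rows can then be written to a database table, a Lucene index, or a
CSV file, whichever store the lookup component is configured to read.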
>>
>> Re 3)
>> You will have to create a dictionary out of this as discussed above
>>
>> Re 4)
>> I prefer to have a single dictionary and a single dictionary lookup
>> component for efficiency and to avoid duplicate annotation.
>>
>> One issue is that the ctakes dictionary lookup component is hardwired to
>> output a specific annotation type (EntityMention, DrugMentionAnnotation,
>> etc.)  If you don't need the extra annotations added by Drug NER and the
>> relation extractor (which decorates the AnatomicalSiteMention I believe),
>> then go with a single dictionary/single dictionary lookup component that
>> outputs an EntityMention for each annotation (this is the YTEX default
>> config).
>>
>> With ytex 0.8/ctakes 2.5 we created a dictionary lookup component that
>> figured out which entity type to output (EntityMention vs DrugMention)
>> based on the CUIs identified.  With ctakes 3.1 there are more types of
>> entities (AnatomicalSiteMention, .. and more?).  I have yet to create a
>> dictionary lookup component that dynamically determines which type of
>> entity to create based on the CUIs/TUIs contained.  The way I would imagine
>> doing this is as follows: In the dictionary lookup component, we could have
>> a map of semantic types to subclass of EntityMention.  If any of the TUIs
>> of the matched concepts are in this map, create the subclass of
>> EntityMention.  E.g. We have a single dictionary, and a single dictionary
>> lookup component, which when coming across 'hand' will find the CUI
>> C0018563 which has the TUI T023 'Body Part'.  T023 is mapped to the
>> AnatomicalSiteMention class and therefore the DictionaryLookup would create
>> an AnatomicalSiteMention annotation.
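
A minimal Python sketch of that idea follows. The classes here are plain
stand-ins for the cTAKES annotation types (which are Java UIMA types), and
create_mention is a hypothetical function. Only the T023 mapping comes from
the example above; the T121 (Pharmacologic Substance) to DrugMention entry
and the "first matching TUI wins" rule are assumptions made for illustration.

```python
class EntityMention:
    """Stand-in for the generic annotation type."""

    def __init__(self, begin, end, cuis):
        self.begin = begin
        self.end = end
        self.cuis = cuis


class AnatomicalSiteMention(EntityMention):
    pass


class DrugMention(EntityMention):
    pass


# Map of semantic types (TUIs) to the subclass of EntityMention to create.
TUI_TO_CLASS = {
    "T023": AnatomicalSiteMention,  # from the 'hand' example above
    "T121": DrugMention,            # assumed mapping, for illustration
}


def create_mention(begin, end, concepts):
    """Create the annotation for one dictionary match.

    `concepts` is a list of (cui, tui) pairs for the matched span. The
    first TUI found in the map decides the subclass (an assumption);
    otherwise a plain EntityMention is created.
    """
    cuis = [cui for cui, _ in concepts]
    for _, tui in concepts:
        cls = TUI_TO_CLASS.get(tui)
        if cls is not None:
            return cls(begin, end, cuis)
    return EntityMention(begin, end, cuis)


mention = create_mention(0, 4, [("C0018563", "T023")])
print(type(mention).__name__)
```

A span whose concepts carry TUIs from more than one mapped class would need
an explicit priority rule; the sketch simply takes the first hit.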
>>
>> If people think this is a good idea, I'll add a jira ticket for 'Smarter
>> DictionaryLookup'
>>
>> vj
>>
>>
>>
>> On Sat, Jan 4, 2014 at 4:03 PM, Vlad Valtchinov <vlad.va...@gmail.com> wrote:
>>
>>> Hello All-
>>>
>>> best wishes for a good 2014.
>>>
>>> I have a question about using a radiology-related ontology, RadLex, in
>>> cTakes. If somebody out there has imported it and used it in processing
>>> radiology reports, we'd like to hear from them to share their experience.
>>>
>>> A couple of specific questions:
>>>
>>> 1. what are the pros and cons of implementing RadLex as a custom
>>> dictionary, or as a part of a UMLS-sanctioned ontology, even though it is
>>> not yet part of UMLS but is part of NCI's Metathesaurus
>>>
>>> 2. what would be the preferred way to import it in cTakes - via an RRF
>>> upload, or other custom format
>>>
>>> 3. could one take a NCImThesaurus download from NCI and use it to import
>>> RadLex
>>>
>>> 4. in a recent discussion regarding the difference between custom
>>> dictionaries and UMLS imported ontologies, "How to augment/modify UMLS
>>> resources?", it looks like cTakes would actually process the note once
>>> for the UMLS supplied resources and once for the (each?) custom
>>> dictionary, if i understand this correctly. Wouldn't it then be very
>>> inefficient to have multiple custom dictionaries, as opposed to trying to
>>> maximize the UMLS ones? Any difference in the YTex behavior (VJ, please
>>> chime in)
>>>
>>> More generally, what are the pros and cons of using all of, e.g., the UMLS
>>> Metathesaurus (or the NCImThesaurus) as ontology dictionaries in cTakes,
>>> apart from licensing issues?
>>>
>>> Thanks,
>>> vlad
>>> Brigham rad
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "ytex-users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to ytex-users+...@googlegroups.com.
>>> To post to this group, send email to ytex-...@googlegroups.com.
>>>
>>> To view this discussion on the web visit https://groups.google.com/d/
>>> msgid/ytex-users/030101cf0990%2477021700%2465064500%24%40gmail.com.
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>
