ctakes-dev mailing list archives

From "Finan, Sean" <Sean.Fi...@childrens.harvard.edu>
Subject RE: LVG documentation
Date Mon, 15 Feb 2016 22:17:48 GMT
Hi Jessica,

You have it correct: LVG adds variants that the dictionary lookup uses to try to find
terms that are not explicitly in the dictionary database, such as the plurals that you
saw.  However, it does not guarantee "better" results.  The lvg module can also add
inaccurate variants, which produce false-positive matches from the dictionary.  For instance,
lvg thinks that the plural of the medication "dos" (docusate) is "doses", so the word "doses"
in text may incorrectly be tagged as the drug.  Chen Lin gets credit for discovering this specific


-----Original Message-----
From: Jessica Glover [mailto:glover.jessica.m@gmail.com] 
Sent: Monday, February 15, 2016 3:36 PM
To: dev@ctakes.apache.org
Subject: LVG documentation


I would like to add a brief explanation and an example to the LVG documentation showing why
the Component Use Guide says that LVG is effectively required for good results in dictionary
lookup, but before I do, I'd like to understand it a bit better myself.

I have an example sentence that yielded different results when I ran it through CuisOnlyUMLSProcessor
with and without LVGAnnotator enabled.

"Nasal canals are free of masses or apparent polyps."

No LVG: Identified Annotations: "Nasal", "polyps"
With LVG: Identified Annotations: "Nasal", "canals", "masses", "polyps"

My guess would be that the canonical (in this case, singular) form of these words is in the
UMLS dictionary but the word tokens themselves are not. Can I generalize to say that using
LVG gives a better chance of getting a dictionary hit for a missed word token by also looking
up relevant variants of that token?
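
As a rough illustration of the behaviour discussed in this thread, here is a minimal Python
sketch. It is not cTAKES or LVG code: the DICTIONARY contents, naive_variants, and lookup are
hypothetical stand-ins, and the variant rule only strips English plural suffixes. It is meant
to show the two effects described above: extra hits when a variant of a token is in the
dictionary, and the "doses" false positive.

```python
# Toy dictionary keyed by the forms it happens to contain. The entries and
# labels are illustrative assumptions, not real UMLS content.
DICTIONARY = {
    "nasal": "anatomy",
    "canal": "anatomy",
    "mass": "finding",
    "polyps": "finding",
    "dos": "medication (docusate)",
}


def naive_variants(token):
    """Return the token plus crude singular candidates.

    Hypothetical stand-in for a variant generator: it only strips
    English plural suffixes and is not how LVG works internally.
    """
    variants = [token]
    if token.endswith("es"):
        variants.append(token[:-2])  # "masses" -> "mass", "doses" -> "dos"
    if token.endswith("s"):
        variants.append(token[:-1])  # "canals" -> "canal"
    return variants


def lookup(tokens, use_variants):
    """Return the tokens that match the dictionary, optionally via variants."""
    hits = []
    for token in tokens:
        word = token.lower()
        candidates = naive_variants(word) if use_variants else [word]
        if any(candidate in DICTIONARY for candidate in candidates):
            hits.append(token)
    return hits


sentence = "Nasal canals are free of masses or apparent polyps."
tokens = sentence.rstrip(".").split()

print(lookup(tokens, use_variants=False))     # ['Nasal', 'polyps']
print(lookup(tokens, use_variants=True))      # ['Nasal', 'canals', 'masses', 'polyps']
print(lookup(["doses"], use_variants=False))  # []
print(lookup(["doses"], use_variants=True))   # ['doses'], a false positive for "dos"
```

In this toy setup the variant step recovers "canals" and "masses" because their singular
forms are in the dictionary, and the same suffix rule maps "doses" to "dos", so the word is
wrongly reported as the medication.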
