ctakes-dev mailing list archives

From "Chen, Pei" <Pei.C...@childrens.harvard.edu>
Subject RE: cTakes Annotation Comparison
Date Fri, 19 Dec 2014 19:16:59 GMT
Kim,
Maintenance is the deciding factor in forging ahead, not bugs/issues.
They are two components that do the same thing with the same goal. (As Sean mentioned, one should
be able to configure the new code base to replicate the old algorithm if required; it's just
a simpler and cleaner code base. If that is not the case, or if there are issues, we should
fix them and move forward.)
We can keep the old component around for as long as needed, but it's likely going to have
limited support…
--Pei

From: Kim Ebert [mailto:kim.ebert@imatsolutions.com]
Sent: Friday, December 19, 2014 1:47 PM
To: Chen, Pei; dev@ctakes.apache.org
Subject: Re: cTakes Annotation Comparison

Pei,

I don't think bugs/issues should be part of determining whether one algorithm is superior to the other.
Obviously, the bugs are worth mentioning, but if the fast lookup method has worse precision
and recall but better performance than the slower but more accurate first-word lookup algorithm,
then time should be invested in fixing those bugs and resolving those odd issues.

Now, I'm not saying which one is superior in this case, as the data will end up speaking for
itself one way or the other; but as of right now, I'm not convinced that the old dictionary
lookup is obsolete, and I'm not sure the community is convinced either.

IMAT Solutions <http://imatsolutions.com>
Kim Ebert
Software Engineer
Office: 801.669.7342
kim.ebert@imatsolutions.com
On 12/19/2014 08:39 AM, Chen, Pei wrote:
Also check out the stats that Sean ran before releasing the new component:
http://svn.apache.org/repos/asf/ctakes/trunk/ctakes-dictionary-lookup-fast/doc/DictionaryLookupStats.docx
From the evaluation and our experience, the new lookup algorithm should be a huge improvement
in terms of both speed and accuracy.
This is very different from what Bruce mentioned… I'm sure Sean will chime in here.
(The old dictionary lookup is essentially obsolete now, plagued with bugs/issues as you mentioned.)
--Pei

From: Kim Ebert [mailto:kim.ebert@perfectsearchcorp.com]
Sent: Friday, December 19, 2014 10:25 AM
To: dev@ctakes.apache.org
Subject: Re: cTakes Annotation Comparison

Guergana,

I'm curious about the number of records in your gold standard sets, and whether your gold
standard set was run through a long-running cTAKES process. I know at some point we fixed
a bug in the old dictionary lookup that caused the permutations to become corrupted over time.
Typically this isn't seen in the first few records, but over time, as patterns are used, the
permutations would become corrupted. This caused documents that were fed through cTAKES more
than once to have fewer codes returned than the first time.

For example, if a permutation of 4,2,3,1 was found, the permutation would be corrupted to
1,2,3,4. It would no longer be possible to detect permutations of 4,2,3,1 until cTAKES
was restarted. We got the fix in after the cTAKES 3.2.0 release: https://issues.apache.org/jira/browse/CTAKES-310
Depending upon the corpus size, I could see the permutation engine eventually being left with
only a single permutation, 1,2,3,4.
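If the corruption came from something like an in-place sort of a shared permutation list, the failure mode is easy to reproduce in miniature. A minimal, hypothetical sketch (not the actual cTAKES code):

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;

    public class PermutationCorruptionDemo {
        public static void main(String[] args) {
            // A shared permutation table, as a long-running lookup
            // process might cache it across documents.
            List<List<Integer>> permutations = new ArrayList<List<Integer>>();
            permutations.add(new ArrayList<Integer>(Arrays.asList(4, 2, 3, 1)));

            // Document 1: a hit on 4,2,3,1, after which some code path
            // sorts the shared list in place instead of sorting a copy.
            List<Integer> hit = permutations.get(0);
            Collections.sort(hit); // the shared entry is now 1,2,3,4

            // Document 2: the 4,2,3,1 pattern can no longer be detected,
            // because the table only holds 1,2,3,4 until a restart.
            System.out.println(permutations.get(0)); // prints [1, 2, 3, 4]
        }
    }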

Typically though, this isn't very easily detected in the first 100 or so documents.

We discovered this issue when we made cTAKES have consistent output of codes in our system.

IMAT Solutions <http://imatsolutions.com>
Kim Ebert
Software Engineer
Office: 801.669.7342
kim.ebert@imatsolutions.com
On 12/19/2014 07:05 AM, Savova, Guergana wrote:

We are doing a similar kind of evaluation and will report the results.



Before we released the Fast lookup, we did a systematic evaluation across three gold standard
sets. We did not see the trend that Bruce reported below. The P, R, and F1 results from the
old dictionary lookup and the fast one were similar.
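For readers comparing scores across this thread, P, R, and F1 here are the standard measures, with true positives (TP), false positives (FP), and false negatives (FN) counted against the gold standard:

    P  = TP / (TP + FP)
    R  = TP / (TP + FN)
    F1 = 2 * P * R / (P + R)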



Thank you everyone!

--Guergana



-----Original Message-----
From: David Kincaid [mailto:kincaid.dave@gmail.com]
Sent: Friday, December 19, 2014 9:02 AM
To: dev@ctakes.apache.org
Subject: Re: cTakes Annotation Comparison



Thanks for this, Bruce! Very interesting work. It confirms what I've seen in the small,
non-systematic tests I've done. Did you happen to capture the number of false positives
yet (annotations made by cTAKES that are not in the human adjudicated standard)? I've seen
a lot of dictionary hits that are not actually entity mentions, but I haven't had a chance
to do a systematic analysis (we're working on our annotated gold standard now). One great
example is the antibiotic "Today": every time the word today appears in any text, it is annotated
as a medication mention, when it is almost never being used in that sense.



These results by themselves are quite disappointing to me. Both the UMLSProcessor and especially
the FastUMLSProcessor seem to have pretty poor recall. It seems like the trade-off for more
speed is a ten-fold (or more) decrease in entity recognition.



Thanks again for sharing your results with us. I think they are very useful to the project.



- Dave



On Thu, Dec 18, 2014 at 5:06 PM, Bruce Tietjen <bruce.tietjen@perfectsearchcorp.com> wrote:



Actually, we are working on a similar tool to compare it to the human adjudicated standard
for the set we tested against. I didn't mention it before because the tool isn't complete yet,
but initial results for the set (excluding those marked as "CUI-less") were as follows:

Human adjudicated annotations: 4,591 (excluding CUI-less)

Annotations found matching the human adjudicated standard:
UMLSProcessor        2,245
FastUMLSProcessor      215
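Measured against the adjudicated total, those counts correspond to recall of roughly 2,245 / 4,591 ≈ 48.9% for the UMLSProcessor and 215 / 4,591 ≈ 4.7% for the FastUMLSProcessor.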

IMAT Solutions <http://imatsolutions.com>
Bruce Tietjen
Senior Software Engineer
Mobile: 801.634.1547
bruce.tietjen@imatsolutions.com



On Thu, Dec 18, 2014 at 3:37 PM, Chen, Pei <Pei.Chen@childrens.harvard.edu> wrote:

Bruce,
Thanks for this-- very useful.
Perhaps Sean Finan can comment more, but it's also probably worth comparing against an
adjudicated, human-annotated gold standard.

--Pei



-----Original Message-----
From: Bruce Tietjen [mailto:bruce.tietjen@perfectsearchcorp.com]
Sent: Thursday, December 18, 2014 1:45 PM
To: dev@ctakes.apache.org
Subject: cTakes Annotation Comparison



With the recent release of cTakes 3.2.1, we were very interested in checking for any
differences in annotations between using the AggregatePlaintextUMLSProcessor pipeline
and the AggregatePlaintextFastUMLSProcessor pipeline within this release of cTakes,
with its associated set of UMLS resources.

We chose to use the SHARE 14-a-b Training data, which consists of 199 documents
(Discharge 61, ECG 54, Echo 42, and Radiology 42), as the basis for the comparison.



We decided to share a summary of the results with the development community.

Documents Processed: 199

Processing Time:
UMLSProcessor        2,439 seconds
FastUMLSProcessor    1,837 seconds

Total Annotations Reported:
UMLSProcessor        20,365 annotations
FastUMLSProcessor     8,284 annotations

Annotation Comparisons:
Annotations common to both sets:                     3,940
Annotations reported only by the UMLSProcessor:     16,425
Annotations reported only by the FastUMLSProcessor:  4,344
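As a consistency check, the common and exclusive counts sum to each pipeline's total: 3,940 + 16,425 = 20,365 for the UMLSProcessor, and 3,940 + 4,344 = 8,284 for the FastUMLSProcessor.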





If anyone is interested, the following was our test procedure:

We used the UIMA CPE to process the document set twice, once using the
AggregatePlaintextUMLSProcessor pipeline and once using the
AggregatePlaintextFastUMLSProcessor pipeline. We used the WriteCAStoFile CAS consumer
to write the results to output files.

We used a tool we recently developed to analyze and compare the annotations generated
by the two pipelines. The tool compares the two outputs for each file and reports any
differences in the annotations (MedicationMention, SignSymptomMention, ProcedureMention,
AnatomicalSiteMention, and DiseaseDisorderMention) between the two output sets. The tool
reports the number of 'matches' and 'misses' between each annotation set. A 'match' is
defined as the presence of an identified source text interval with its associated CUI
appearing in both annotation sets. A 'miss' is defined as the presence of an identified
source text interval and its associated CUI in one annotation set, but no matching
identified source text interval and CUI in the other. The tool also reports the total
number of annotations (source text intervals with associated CUIs) reported in each
annotation set. The compare tool is in our GitHub repository at
https://github.com/perfectsearch/cTAKES-compare
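The matching rule is straightforward to express as set operations. As a rough illustration, here is a minimal, hypothetical sketch (not the actual cTAKES-compare tool; the offsets and CUIs are made-up example values) that treats an annotation as a (begin, end, CUI) key:

    import java.util.HashSet;
    import java.util.Set;

    public class AnnotationCompare {
        // Identify an annotation by its source text interval and CUI.
        static String key(int begin, int end, String cui) {
            return begin + "-" + end + ":" + cui;
        }

        public static void main(String[] args) {
            Set<String> oldPipeline = new HashSet<String>();
            Set<String> fastPipeline = new HashSet<String>();
            oldPipeline.add(key(10, 15, "C0011849"));
            oldPipeline.add(key(30, 42, "C0020538"));
            fastPipeline.add(key(10, 15, "C0011849"));

            // 'Matches': interval+CUI present in both annotation sets.
            Set<String> matches = new HashSet<String>(oldPipeline);
            matches.retainAll(fastPipeline);

            // 'Misses': interval+CUI present in one set but not the other.
            Set<String> oldOnly = new HashSet<String>(oldPipeline);
            oldOnly.removeAll(fastPipeline);
            Set<String> fastOnly = new HashSet<String>(fastPipeline);
            fastOnly.removeAll(oldPipeline);

            System.out.println("matches: " + matches.size());    // 1
            System.out.println("old only: " + oldOnly.size());   // 1
            System.out.println("fast only: " + fastOnly.size()); // 1
        }
    }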





