ctakes-dev mailing list archives

From "Finan, Sean" <Sean.Fi...@childrens.harvard.edu>
Subject RE: Lucene for UMLS2014
Date Mon, 21 Jul 2014 22:28:00 GMT
Hi Harpreet,

If you are willing to use cTAKES 3.2, try the dictionary-lookup-fast module as a replacement
for the default dictionary-lookup.  That module has a new dictionary resource (hsql, not lucene)
and slightly different methods for lookup and matching.  In time trials it has been faster
than the default module (hence the name).  Accuracy depends on the parameter settings, but
in the tests performed so far the results are comparable or better.  The new dictionary is
much leaner than the current default dictionary, small enough to port from the hsql cached
version to a hsql in-memory version.  Using the in-memory version makes dictionary lookup
practically instantaneous (hundredths of a second).  Limited documentation is available in
the module's doc/ directory.
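
To illustrate why a dictionary small enough to sit in memory can answer lookups so quickly, here
is a minimal Java sketch.  It is not the dictionary-lookup-fast code and does not use hsql; the
class name, the term strings and the concept codes are hypothetical placeholders.  It only shows
the general idea of holding the whole term table in memory so that each lookup is a hash probe
rather than a disk read.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative only: NOT the dictionary-lookup-fast implementation.
// Terms and concept codes used with this class are made-up placeholders.
final class InMemoryDictionarySketch {

    // Normalized term text -> concept codes, held entirely in memory.
    private final Map<String, List<String>> termToCodes = new HashMap<>();

    // Register a term under a concept code; one term may map to several codes.
    void addTerm(String term, String code) {
        termToCodes.computeIfAbsent(normalize(term), k -> new ArrayList<>()).add(code);
    }

    // Return the codes for an exact (case-insensitive, trimmed) term match,
    // or an empty list when the term is not in the dictionary.
    List<String> lookup(String text) {
        return termToCodes.getOrDefault(normalize(text), Collections.emptyList());
    }

    private static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        InMemoryDictionarySketch dict = new InMemoryDictionarySketch();
        dict.addTerm("Heart Attack", "CODE-0001");
        dict.addTerm("myocardial infarction", "CODE-0001");
        dict.addTerm("aspirin", "CODE-0002");
        System.out.println(dict.lookup("heart attack"));
        System.out.println(dict.lookup("unknown term"));
    }
}
```

A real dictionary also has to handle multi-word spans, word variants and overlapping matches,
which this sketch ignores; the module's doc/ directory is the place to look for how it actually
does that.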

I will be on vacation for a week, but please don't hesitate to write if you have any questions.

From: Harpreet Khanduja [hsk5004@rit.edu]
Sent: Thursday, July 17, 2014 5:07 PM
To: dev@ctakes.apache.org
Subject: Lucene for UMLS2014

    I would be grateful if someone could help.

    I created a Lucene index for umls2014, but only for the SNOMED vocabulary.
    I did this because I thought it would reduce the dictionary lookup time,
    but it is still almost the same. Is there any other way to improve the
    dictionary lookup time?

Thank you,
