ctakes-dev mailing list archives

From "Dligach, Dmitriy" <Dmitriy.Dlig...@childrens.harvard.edu>
Subject Re: lvg entries
Date Thu, 17 Apr 2014 16:30:38 GMT
Tim, this is a very interesting observation. Could you please send a few examples of what LVG
generates? Both sensical and non :)

Dima

On Apr 17, 2014, at 11:28, Miller, Timothy <Timothy.Miller@childrens.harvard.edu> wrote:

> The LVG annotator creates an enormous number of "lemmas" for every
> WordToken in the CAS, and I'm wondering what the original purpose was? I
> think this is probably a minor bottleneck for speed but mostly a pretty
> big space hog (at least 50% of the space of xmi files in my tests).
> 
> As of right now I'm not sure if any downstream components are using
> these lemmas, and on a manual inspection the precision seems to be
> pretty abysmal (meaning most of them are nonsensical as lexical
> variants), so as I said, just wondering if we can revisit why cTAKES
> generates so many and whether that component can be optimized.
> 
> Thanks
> Tim
> 

