ctakes-dev mailing list archives

From "Savova, Guergana" <Guergana.Sav...@childrens.harvard.edu>
Subject RE: Exploiting the power of cTakes, using OpenNLP only
Date Fri, 22 May 2015 12:04:08 GMT
Yes, you are correct. cTAKES does named entity recognition and normalization, i.e. mapping
to an ontology (through the UMLS). The normalization part is what is different from what is usually
done in the general domain (where mentions of several semantic types are discovered but not
necessarily normalized to a concept within an ontology). In the general domain, there is a
recent trend to normalize to Wikipedia (wikification).
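
To make the distinction between recognition and normalization concrete, here is a minimal
sketch of a dictionary lookup in plain Java. It is not how cTAKES is implemented and it uses
no cTAKES, UIMA, or OpenNLP classes. The class name, the three dictionary entries, and the
"DEMO:" concept identifiers are all made up for illustration; real identifiers would come
from a licensed UMLS installation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only: recognition finds the span of a mention,
// normalization maps that span to a concept identifier.
class ConceptLookupSketch {

    /** A recognized mention: character span, surface text, and normalized concept id. */
    static final class Mention {
        final int begin;
        final int end;
        final String text;
        final String conceptId;

        Mention(int begin, int end, String text, String conceptId) {
            this.begin = begin;
            this.end = end;
            this.text = text;
            this.conceptId = conceptId;
        }
    }

    // Made-up dictionary. Two different surface forms share one concept id,
    // which is the point of normalization.
    private static final Map<String, String> DICTIONARY = new LinkedHashMap<>();
    static {
        DICTIONARY.put("aspirin", "DEMO:0001");
        DICTIONARY.put("heart attack", "DEMO:0002");
        DICTIONARY.put("myocardial infarction", "DEMO:0002");
    }

    /** Finds dictionary terms in the text (case-insensitive, whole words only). */
    static List<Mention> annotate(String text) {
        // Offsets line up with the original text for ASCII input.
        String lower = text.toLowerCase(Locale.ROOT);
        List<Mention> mentions = new ArrayList<>();
        for (Map.Entry<String, String> entry : DICTIONARY.entrySet()) {
            String term = entry.getKey();
            int from = 0;
            int begin;
            while ((begin = lower.indexOf(term, from)) >= 0) {
                int end = begin + term.length();
                if (isBoundary(lower, begin - 1) && isBoundary(lower, end)) {
                    mentions.add(new Mention(begin, end,
                            text.substring(begin, end), entry.getValue()));
                }
                from = begin + 1;
            }
        }
        mentions.sort(Comparator.comparingInt((Mention m) -> m.begin));
        return mentions;
    }

    // True when the index is outside the string or not a letter or digit.
    private static boolean isBoundary(String s, int index) {
        return index < 0 || index >= s.length()
                || !Character.isLetterOrDigit(s.charAt(index));
    }

    public static void main(String[] args) {
        String text = "Aspirin was started after the myocardial infarction.";
        for (Mention m : annotate(text)) {
            System.out.println(m.text + " [" + m.begin + "," + m.end + ") -> " + m.conceptId);
        }
    }
}
```

A general-domain NER model would stop at the span and a type label; the extra step here is
that "heart attack" and "myocardial infarction" resolve to the same identifier.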

In short, to do the NER in cTAKES you do need a license for the UMLS. BTW, that license is
free for level 0 vocabularies.

Hope this information helps.
--Guergana

-----Original Message-----
From: Damir Olejar [mailto:olejar.damir@gmail.com] 
Sent: Friday, May 22, 2015 7:51 AM
To: dev@ctakes.apache.org
Subject: Re: Exploiting the power of cTakes, using OpenNLP only

To answer my own question, it all comes down to UMLS licensing, and which files are being
downloaded from the server.
The files that are downloaded are compressed *.model files that can be integrated with cTakes.
However, there is (or might be in the near future) a restriction on which users can download
which files, and there might also be a copyright issue if the UMLS procedure is not followed.

So, yes, there is no need for UIMA, but then, for any serious work, the copyrights need to
be respected.


On Thu, May 21, 2015 at 12:10 PM, Damir Olejar <olejar.damir@gmail.com>
wrote:

> To whom it may concern,
>
> First, I would like to apologize if my question is vague, since I am 
> new and unaccustomed to the cTakes terminology. To keep my question simple 
> and to the point, let us assume that I am working only with Apache 
> OpenNLP. I do not have any UIMA-specific JAR files included, and let 
> us assume that I do not want to include any of them (or want to keep them to a 
> minimum), thus keeping the project confined to OpenNLP as much as possible.
>
> As far as I know, UIMA is just a framework that does not provide any 
> specific NLP tools (source:
> http://stackoverflow.com/questions/24186742/is-uima-provides-only-a-wrapper-or-is-it-like-standfordcore-nlp-and-gate
> ).
> This means that there should be a way of integrating the cTakes 
> components with OpenNLP.
>
> What I would like to do is to simply have Named Entity Recognition
> (NER) applied to a text, so I know which words in an input 
> sentence are medical terms. The perfect option would be if I could 
> have a *.bin file such as "en-ner-person.bin", but I think that cTakes 
> does not give us such an option, since there are no *.bin files.
>
> How would I accomplish such a task? Would there be any code, examples, 
> tutorials, documentation, pseudo-code, ideas, … to take a look at?
>
> Thank you kindly for your time, understanding, and patience.
>
> Damir
>