ctakes-dev mailing list archives

From "Chen, Pei" <Pei.C...@childrens.harvard.edu>
Subject RE: cTAKES Translation
Date Fri, 15 Nov 2013 13:49:28 GMT
Hi Roberto,
Welcome!  

In theory, in order to have cTAKES work in a different language, we would just need to:
-Retrain the existing ML-based models for the language and code should just work as is for
-Update any hard-coded rules
-Use the Spanish dictionary for concepts (I believe UMLS already has a Spanish translation
for some of their thesauruses).
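
As a rough illustration of the dictionary step only, here is a minimal Python sketch that pulls Spanish entries out of a UMLS MRCONSO.RRF file. It assumes the standard pipe-delimited MRCONSO layout (CUI in field 0, language code LAT in field 1, term string STR in field 14) and the language code "SPA" for Spanish. The function names and the sample rows are made up for illustration, and loading the result into the cTAKES dictionary lookup is not shown.

```python
from collections import defaultdict

# Field positions in MRCONSO.RRF (pipe-delimited, no header row).
CUI, LAT, STR = 0, 1, 14


def terms_for_language(lines, language="SPA"):
    """Yield (CUI, term) pairs for rows whose LAT field equals `language`."""
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) > STR and fields[LAT] == language:
            yield fields[CUI], fields[STR]


def build_lookup(pairs):
    """Map each lowercased term to the set of CUIs it can refer to."""
    lookup = defaultdict(set)
    for cui, term in pairs:
        lookup[term.lower()].add(cui)
    return dict(lookup)


# Hypothetical sample rows in MRCONSO layout; the identifiers other than
# the column structure are invented for this example.
SAMPLE_ROWS = [
    "C0000001|ENG|P|L0000001|PF|S0000001|Y|A0000001||||SRC|PT|X1|Example disease|0|N||",
    "C0000001|SPA|P|L0000002|PF|S0000002|Y|A0000002||||SRCSPA|PT|X1|Enfermedad de ejemplo|3|N||",
    "C0000002|SPA|P|L0000003|PF|S0000003|Y|A0000003||||SRCSPA|PT|X2|Enfermedad de ejemplo|3|N||",
]

spanish_lookup = build_lookup(terms_for_language(SAMPLE_ROWS))
```

With a real UMLS release, the same functions could be applied to an open file handle, for example `build_lookup(terms_for_language(open(path, encoding="utf-8")))`, where `path` points to a local MRCONSO.RRF.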
I think it would be awesome to have cTAKES work with multiple languages including Spanish!
Actually, a lot of folks have been asking about cTAKES models in different languages.
The challenging thing with the supervised machine learning methods is that we'll have to rely
on local domain experts to create the gold standard for training.
There is a group that may be contributing retrained models for cTAKES to work in French.
Others can feel free to chime in...

--Pei

> -----Original Message-----
> From: Roberto Costumero Moreno [mailto:roberto.costumero@upm.es]
> Sent: Thursday, November 14, 2013 5:43 AM
> To: dev@ctakes.apache.org
> Subject: cTAKES Translation
> 
> Hello everyone,
> 
> My name is Roberto Costumero and I am a Ph.D. student at the Technical
> University of Madrid in Spain. I am new to this list, so I am
> introducing myself and posting some questions I have.
> 
> We are currently involved in a project with several hospitals, and we
> are working closely with them to understand their needs in order
> to build an application that lets them use the knowledge in their clinical
> notes and imaging, among other things.
> 
> We have been looking at different projects to see which one fits our
> needs and, of course, which one we will share our research with. Among
> the different projects we have seen in the field of clinical text analysis, we
> think that cTAKES is the best one out there, and it is very well structured and
> organized. The main problem we are facing is that every clinical text-based
> NLP project is developed for English, and we will be working with
> Spanish texts.
> 
> We have already done some work testing different algorithms, translating
> them to Spanish to detect negation and context dependency, but we would
> like to work with a well-tested, complete framework, so we thought
> about cTAKES. I have a few questions for you.
> 
> - Does anyone know if someone is already working on translating cTAKES
> modules to work with other languages (Spanish in particular)?
> - Do you think it would be very difficult to do because of any architectural
> design I am not currently aware of?
> - Do you think it would be a good line of development (for the cTAKES
> project) to work together on extending cTAKES to other languages, Spanish in
> this case?
> 
> Thank you very much in advance for your help.
> 
> Sincerely,
> 
> --
> Roberto Costumero Moreno
> Laboratorio de Minería de Datos y Simulación (MIDAS) Centro de Tecnología
> Biomédica Universidad Politecnica de Madrid roberto.costumero@upm.es
> Tlf: +34 91 336 4664

