incubator-ctakes-commits mailing list archives

From chen...@apache.org
Subject svn commit: r1417000 - /incubator/ctakes/site/trunk/content/ctakes/index.mdtext
Date Tue, 04 Dec 2012 15:45:22 GMT
Author: chenpei
Date: Tue Dec  4 15:45:21 2012
New Revision: 1417000

URL: http://svn.apache.org/viewvc?rev=1417000&view=rev
Log:
Adding more details about the components

Modified:
    incubator/ctakes/site/trunk/content/ctakes/index.mdtext

Modified: incubator/ctakes/site/trunk/content/ctakes/index.mdtext
URL: http://svn.apache.org/viewvc/incubator/ctakes/site/trunk/content/ctakes/index.mdtext?rev=1417000&r1=1416999&r2=1417000&view=diff
==============================================================================
--- incubator/ctakes/site/trunk/content/ctakes/index.mdtext (original)
+++ incubator/ctakes/site/trunk/content/ctakes/index.mdtext Tue Dec  4 15:45:21 2012
@@ -17,34 +17,33 @@ Notice:    Licensed to the Apache Softwa
            under the License.
 
 # Welcome to Apache cTAKES (incubating).
-Apache cTAKES: clinical Text Analysis and Knowledge Extraction System is an open-source natural
language processing system for information extraction from electronic medical record clinical
free-text. It processes clinical notes, identifying types of clinical named entities - medications,
diseases/disorders, signs/symptoms, anatomical sites and procedures. Each named entity has
attributes for the text span, the ontology mapping code, subject (patient, family member,
etc.) and context (negated/not negated, conditional, generic).
+Apache clinical Text Analysis and Knowledge Extraction System (cTAKES) is an open-source
natural language processing system for information extraction from electronic medical record
clinical free-text. It processes clinical notes, identifying types of clinical named entities
from various dictionaries including the Unified Medical Language System ([UMLS](http://www.nlm.nih.gov/research/umls/))
- medications, diseases/disorders, signs/symptoms, anatomical sites and procedures. Each named
entity has attributes for the text span, the ontology mapping code, subject (patient, family
member, etc.) and context (negated/not negated, conditional, generic, degree of certainty).
Some of the attributes are expressed as relations, for example the location of a clinical
condition (locationOf relation) or the severity of a clinical condition (degreeOf relation).

 
-Apache cTAKES was built using the UIMA Unstructured Information Management Architecture framework
and OpenNLP natural language processing toolkit. Its components are specifically trained for
the clinical domain, and create rich linguistic and semantic annotations that can be utilized
by clinical decision support systems and clinical research.
+Apache cTAKES was built using the Apache UIMA (Unstructured Information Management Architecture)
framework and the Apache OpenNLP natural language processing toolkit. Its components
are specifically trained for the clinical domain on diverse manually annotated datasets,
and create rich linguistic and semantic annotations that can be utilized by clinical decision
support systems and clinical research. cTAKES has been used in a variety of biomedical use
cases such as phenotype discovery, translational science, pharmacogenomics
and pharmacogenetics.
 
-These components include:
+Apache cTAKES employs a number of rule-based and machine learning methods. Apache cTAKES
components include:
  
-  - Sentence boundary detection (OpenNLP technology)
-  - Tokenization (rule-based)
-  - Morphologic normalization (NLM's LVG)
-  - POS tagging (OpenNLP technology)
-  - Shallow parsing (OpenNLP technology)
-  - Named Entity Recognition
-    1. Dictionary mapping (lookup algorithm)
-    2. Semantic types: diseases/disorders, signs/symptoms, anatomical sites, procedures,
medications
-  - Assertion module
-  - Dependency parser
-  - Constituency parser 
-  - Semantic Role Labeler 
-  - Coreference resolver 
-  - Drug Profile module
-  - Smoking status classifier
+   1. Sentence boundary detection
+   1. Tokenization (rule-based)
+   1. Morphologic normalization
+   1. POS tagging
+   1. Shallow parsing
+   1. Named Entity Recognition
+      - Dictionary mapping
+      - Semantic typing is based on these UMLS semantic types: diseases/disorders, signs/symptoms,
anatomical sites, procedures, medications
+   1. Assertion module
+   1. Dependency parser
+   1. Constituency parser
+   1. Semantic Role Labeler
+   1. Coreference resolver
+   1. Relation extractor
+   1. Drug Profile module
+   1. Smoking status classifier
 
-The goal of cTAKES is to be a world-class natural language processing system in the healthcare
domain.
-cTAKES can be used in a great variety of retrievals and use cases. It is intended to be modular
and expandable at the information model and method level.
 
-The cTAKES community is committed to best practices and R&D (research and development)
by using cutting edge technologies and novel research. 
-The idea is to translate the working software quickly into cTAKES code.
+The goal of cTAKES is to be a world-class natural language processing system in the healthcare
domain. cTAKES can be used in a great variety of retrievals and use cases. It is intended
to be modular and expandable at the information model and method level.
+The cTAKES community is committed to best practices and R&D (research and development)
by using cutting edge technologies and novel research. The idea is to quickly translate the
best performing methods into cTAKES code.
+
 
 # Incubation
 Apache cTAKES is an effort undergoing incubation at The Apache Software Foundation (ASF),
sponsored by the Apache Incubator PMC. Incubation is required of all newly accepted projects
until a further review indicates that the infrastructure, communications, and decision making
process have stabilized in a manner consistent with other successful ASF projects. While incubation
status is not necessarily a reflection of the completeness or stability of the code, it does
indicate that the project has yet to be fully endorsed by the ASF.
-


