incubator-ctakes-commits mailing list archives

From blee...@apache.org
Subject svn commit: r1408965 - /incubator/ctakes/site/trunk/content/ctakes/3.0.0/user-guide-3.0.mdtext
Date Tue, 13 Nov 2012 21:14:17 GMT
Author: bleeker
Date: Tue Nov 13 21:14:17 2012
New Revision: 1408965

URL: http://svn.apache.org/viewvc?rev=1408965&view=rev
Log:
CMS commit to ctakes by bleeker

Modified:
    incubator/ctakes/site/trunk/content/ctakes/3.0.0/user-guide-3.0.mdtext

Modified: incubator/ctakes/site/trunk/content/ctakes/3.0.0/user-guide-3.0.mdtext
URL: http://svn.apache.org/viewvc/incubator/ctakes/site/trunk/content/ctakes/3.0.0/user-guide-3.0.mdtext?rev=1408965&r1=1408964&r2=1408965&view=diff
==============================================================================
--- incubator/ctakes/site/trunk/content/ctakes/3.0.0/user-guide-3.0.mdtext (original)
+++ incubator/ctakes/site/trunk/content/ctakes/3.0.0/user-guide-3.0.mdtext Tue Nov 13 21:14:17 2012
@@ -17,383 +17,128 @@ Notice:    Licensed to the Apache Softwa
            under the License.
 
 #cTAKES 3.0 User Guide
-These instructions are for end users. With these instructions you can install cTAKES, configure it, and use it to process text (typically text associated with a medical record). If you were planning to expand, change, or modify the code within cTAKES, refer to the [cTAKES 2.5 Developer Install Instructions|cTAKES%2B2.5%2BDeveloper%2BInstall%2BInstructions.html].
-
-These instructions will cover installation and a test of the main product including trained models for sentence detection and tagging parts of speech, dictionaries from a subset of the UMLS, a very small subset of the full LVG resource, etc. Optional components will also be described.
-
-Once you have finished installation of cTAKES, you will be able to see what cTAKES is capable of. Further exploitation of the software's ability may require following a few additional steps involving what dictionaries are being used. These are the last steps in these instructions.
-
-h2. Prerequisites
-
+These instructions are for end users. With these instructions you can
+install cTAKES, configure it, and use it to process text (typically text
+associated with a medical record). If you were planning to expand,
+change, or modify the code within cTAKES, refer to the [cTAKES 2.5
+Developer Install Instructions][].
+
+These instructions will cover installation and a test of the main
+product including trained models for sentence detection and tagging
+parts of speech, dictionaries from a subset of the UMLS, a very small
+subset of the full LVG resource, etc. Optional components will also be
+described.
+
+Once you have finished installation of cTAKES, you will be able to see
+what cTAKES is capable of. Further exploitation of the software’s
+ability may require following a few additional steps involving what
+dictionaries are being used. These are the last steps in these
+instructions.
+
+Prerequisites
+-------------
+
+<div class="table-wrap">
+<table class="confluenceTable">
+<tbody>
+<tr>
+<th class="confluenceTh">
 Step
 
+</th>
+<th class="confluenceTh">
 Example
 
-1. Make sure you have Java 1.6 or higher. Most systems come with Java already installed. \\ Run this command to check your version. \\
-
+</th>
+</tr>
+<tr>
+<td class="confluenceTd">
+​1. Make sure you have Java 1.6 or higher. Most systems come with Java
+already installed. \
+ Run this command to check your version. \
+
+<div class="code panel" style="border-width: 1px;">
+<div class="codeContent panelContent">
+~~~~ {.theme: .Confluence; .brush: .plain; .gutter: .false
+style="font-size:12px;"}
 java -version
+~~~~
 
-If you do not you can install Java from [java.com|http://www.java.com/en/download/faq/develop.xml].
-
+</div>
+</div>
+If you do not you can install Java from [java.com][].
+
+</td>
+<td class="confluenceTd">
+<div class="code panel" style="border-width: 1px;">
+<div class="codeContent panelContent">
+~~~~ {.theme: .Confluence; .brush: .plain; .gutter: .false
+style="font-size:12px;"}
 C:\>java -version
 java version "1.6.0_20"
 Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
 Java HotSpot(TM) Client VM (build 16.3-b01, mixed mode, sharing)
+~~~~
 
-h2. Install cTAKES
-
+</div>
+</div>
+</td>
+</tr>
+</tbody>
+</table>
+</div>
+Install cTAKES
+--------------
+
+<div class="table-wrap">
+<table class="confluenceTable">
+<tbody>
+<tr>
+<th class="confluenceTh">
 Step
 
+</th>
+<th class="confluenceTh">
 Example
 
-1. Navigate to the [source downloads for a released version|http://sourceforge.net/projects/ohnlp/files/cTAKES] on SourceForge
-
-\\
-
-2. Download the *cTAKES-2.5.zip* file. \\ Save the file to a temporary location on your machine.
-
-attachments/75014322/76808927.jpg
-
-3. Unzip (extract the contents of) the compressed file you downloaded into a directory that you want to be the cTAKES install location. \\ For example, *Windows*: \\
-
+</th>
+</tr>
+<tr>
+<td class="confluenceTd">
+​1. Navigate to the [source downloads for a released version][] on
+SourceForge
+
+</td>
+<td class="confluenceTd">
+\
+
+</td>
+</tr>
+<tr>
+<td class="confluenceTd">
+​2. Download the **cTAKES-2.5.zip** file. \
+ Save the file to a temporary location on your machine.
+
+</td>
+<td class="confluenceTd">
+![screenshot illustrating step][]
+
+</td>
+</tr>
+<tr>
+<td class="confluenceTd">
+​3. Unzip (extract the contents of) the compressed file you downloaded
+into a directory that you want to be the cTAKES install location. \
+ For example, **Windows**: \
+
+<div class="code panel" style="border-width: 1px;">
+<div class="codeContent panelContent">
+~~~~ {.theme: .Confluence; .brush: .plain; .gutter: .false
+style="font-size:12px;"}
 c:\cTAKES-2.5
+~~~~
 
-\\*Linux*: \\
-
-/usr/bin/cTAKES-2.5
-
-\\ This folder we will call *<cTAKES_HOME>*. You will need to refer to the directory later.
-
-attachments/75014322/76808930.jpg\\ |
-
-h2. Process documents using cTAKES
-
-This version allows you to test most components bundled in cTAKES in two different ways:
-
-# Using cTAKES CAS Visual Debugger (CVD) to view the results stored as XCAS files or run the annotators or
-# Using cTAKES collection processing engine (CPE) to process documents in cTAKES_HOME/testdata directory
-
-h3. CAS Visual Debugger (CVD)
-
-Step
-
-Example
-
-1. Open a command prompt and change to the cTAKES_HOME directory. \\*Windows*: \\
-
-cd \cTAKES-2.5
-
-\\*Linux*: \\
-
-cd /usr/bin/cTAKES-2.5images/icons/emoticons/warning.png*Note*\\
-
-cTAKES_HOME must be your current directory unless you are skilled at setting paths on your machine.
-
-2. Start the CAS Visual Debugger by running this command: \\*Windows*: \\
-
-runctakesCVD.bat
-
-\\*Linux*: \\
-
-runctakesCVD.sh
-
-\\ The application may take a minute to start on slower hardware.
-
-attachments/75014322/74941782.png
-
-3. An analysis engine (AE) needs to be loaded in order to process text. \\ Use the *Run* -> *Load AE* menu bar command. Navigate to the file
-
-<cTAKES_HOME>/cTAKESdesc/cdpdesc/analysis_engine/AggregatePlaintextProcessor.xml
-
-Click *Open*.
-
-attachments/75014322/74941783.png
-
-4. Copy the text in the example at the right (next cell) and paste the contents into the Text section of CVD, replacing the text that is already there. \\ This example file can also be found in test data:
-
-<cTAKES_HOME>/testdata/cdptest/testinput/plaintext/testpatient_plaintext_1.txt
-
-Dr. Nutritious \\\\ Medical Nutrition Therapy for Hyperlipidemia \\\\ Referral from: Julie Tester, RD, LD, CNSD \\ Phone contact: (555) 555-1212 \\ Height: 144 cm Current Weight: 45 kg Date of current weight: 02-29-2001 \\ Admit Weight: 53 kg BMI: 18 kg/m2 \\ Diet: General \\ Daily Calorie needs (kcals): 1500 calories, assessed as HB + 20% for activity. \\ Daily Protein needs: 40 grams, assessed as 1.0 g/kg. \\ Pt has been on a 3-day calorie count and has had an average intake of 1100 calories. \\ She was instructed to drink 2-3 cans of liquid supplement to help promote weight gain. \\ She agrees with the plan and has my number for further assessment. May want a Resting \\ Metabolic Rate as well. She takes an aspirin a day for knee pain.
-
-3. From the menu bar, click *Run* -> *Run AggregatePlaintextProcessor*. \\\\ You'll get a list of all the annotations in the Analysis Results frame.
-
-attachments/75014322/74941784.png
-
-4. Named entities are now recognized in this clinical document. Annotations of MedicationEventMention and EntityMention are created. To find one, in the *Analysis Results frame*, click on the key in front of: \\ AnnotationIndex \\ uima.tcas.Annotation \\ edu.mayo.bmi.uima.core.type.textsem.IdentifiedAnnotation \\ edu.mayo.bmi.uima.core.type.textsem.EntityMention  \\ and \\ edu.mayo.bmi.uima.core.type.textsem.EventMention \\ edu.mayo.bmi.uima.core.type.textsem.EventMention.MedicationEventMention \\   \\\\ Then select *edu.mayo.bmi.uima.core.type.**textsem.**EntityMention* or *edu.mayo.bmi.uima.core.type.**textsem.**EventMention.**Medication**EventMention*.This will show an Annotation Index in the lower frame. Select any annotation in that lower frame and you will see the text discovered in the Text frame on the right. You may close CVD if you wish.
-
-attachments/75014322/74941785.png
-
-h3. Collection processing engine (CPE)
-
-Step
-
-Example
-
-1. Open a command prompt and change to the cTAKES_HOME directory: \\*Windows*: \\
-
-cd C:\cTAKES2.5
-
-\\*Linux*: \\
-
-cd /usr/bin/cTAKES2.5images/icons/emoticons/warning.png*Note*\\
-
-Note that cTAKES_HOME must be your current directory unless you are skilled at setting paths on your machine.
-
-2. Start the collection processing engine by running this command: \\*Windows*: \\
-
-runctakesCPE.bat
-
-\\*Linux*: \\
-
-runctakesCPE.sh
-
-\\ The application may take a minute to start on slower hardware.
-
-attachments/75014322/74941782.png
-
-3. This will bring up the Collection Processing Engine Configurator. In the Menu bar click *File* > *Open CPE Descriptor*
-
-attachments/75014322/74941786.png
-
-4. Navigate to the file
-
-<cTAKES_HOME>/cTAKESdesc/cdpdesc/collection_processing_engine/test_plaintext.xml
-
-Click *Open*.
-
-attachments/75014322/74941773.png
-
-5. Click the Play button (green/blue *play arrow* near the bottom).
-
-attachments/75014322/74941774.png
-
-6. You should see that one document was processed. You did process a collection of documents. In this case the collection only contained one just to show how to do it. Close the results window.
-
-attachments/75014322/74941775.png
-
-7. Close the CPE application. You may be prompted to save changes. Since this was just a test you may click the *No* button.
-
-attachments/75014322/74941776.png
-
-8. Open a new command prompt and change to the <cTAKES_HOME>
-
-No example.
-
-9. To test the results there is a comparison tool that will help show that the results match expectations with the following syntax: \\
-
-java -cp cTAKES.jar edu.mayo.bmi.utils.xcas_comparison.Compare
-<First File> <Second File> <diff-html>
-
-Where: *_<First File>_* is the first file to compare; *_<Second File>_* is the second file to compare; *_<diff-html>_* is where the results are written to \\\\ Copy and paste the example at the right (next cell) which has had our example files already substituted into a command prompt to run. In this case we have shipped an example of what the output should be for you to compare against.
-
-*Windows*:
-
-java -cp cTAKES.jar edu.mayo.bmi.utils.xcas_comparison.Compare ^
-"testdata\cdptest\testoutput\plaintext\sample_note_plaintext.xml" ^
-"testdata\cdptest\testsampleoutput\plaintext\sample_note_plaintext.xml" ^
-c:\stuff\diff-html.html
-
-*Linux*:
-
-java edu.mayo.bmi.utils.xcas_comparison.Compare \
-"/usr/bin/cTAKES2.5/testdata/cdptest/testoutput/plaintext\sample_note_plaintext.xml" \
-"/usr/bin/cTAKES2.5/testdata/cdptest/testsampleoutput/plaintext/sample_note_plaintext.xml" \
-/tmp/diff-html.html
-
-10. The resulting file will open for you. Look at the comparison to see the annotations resulting from this pipeline. \\*Windows:*
-
-c:\stuff\diff-html.html
-
-*Linux*:
-
-/tmp/diff-html.html
-
-attachments/75014322/74941777.png
-
-Using the same CVD and CPE programs in the manner described above, you can test all the other components. The analysis engines and collection processing engines shipped with cTAKES for some of the annotators are described in the following table.
-
-Annotator
-
-Description
-
-Abbreviated
-
-Example Analysis Engine (AE)
-
-Example Collection processing Engine (CPE)
-
-Example test data
-
-Clinical Document Pipeline
-
-the complete cTAKES pipeline to obtain majority of cTAKES annotations
-
-cdp
-
-cTAKES_HOME/cTAKESdesc/cdpdesc/analysis_engine/AggregatePlaintextProcessor.xml
-
-cTAKES_HOME/cTAKESdesc/cdpdesc/collection_processing_engine/test_plaintext.xml
-
-cTAKES_HOME/testdata/cdptest
-
-Chunker
-
-obtain cTAKES chunking annotations
-
-chunker
-
-cTAKES_HOME/cTAKESdesc/chunkerdesc/analysis_engine/ChunkerAggregate.xml
-
-cTAKES_HOME/cTAKESdesc/chunkerdesc/collection_processing_engine/ChunkerCPE.xml
-
-cTAKES_HOME/testdata/chunkertest
-
-Dependency Parser
-
-obtain dependency parsing tree
-
-dp
-
-cTAKES_HOME/cTAKESdesc/dpdesc/analysis_engine/ClearParserTokenizedInfPosAggregate.xml
-
-cTAKES_HOME/cTAKESdesc/dpdesc/collection_processing_engine/ClearParserCPE.xml
-
-cTAKES_HOME/testdata/dptest
-
-Drug NER
-
-the annotator to obtain drug annotations
-
-drugner
-
-cTAKES_HOME/cTAKESdesc/drugnerdesc/analysis_engine/DrugAggregatePlaintextProcesor.xml
-
-cTAKES_HOME/cTAKESdesc/drugnerdesc/collection_processing_engine/DrugNER_PlainText_CPE.xml
-
-cTAKES_HOME/testdata/drugnertest
-
-Dictionary Lookup
-
-mapping cTAKES annotations to dictionaries (e.g., SNOMED_CT or RxNorm
-
-lookup
-
-cTAKES_HOME/cTAKESdesc/lookupdesc/analysis_engine/TestAggregateTAE.xml
-
-cTAKES_HOME/cTAKESdesc/lookupdesc/collection_processing_engine/LookupCPE.xml
-
-cTAKES_HOME/testdata/lookuptest
-
-PAD Term Spotter
-
-identifying terms related to PAD
-
-pad
-
-cTAKES_HOME/cTAKESdesc/paddesc/analysis_engine/Radiology_TermSpotterAnnotatorTAE.xml
-
-cTAKES_HOME/cTAKESdesc/paddesc/collection_processing_engine/Radiology_Sample.xml
-
-cTAKES_HOME/testdata/padtest
-
-Smoking Status
-
-the annotator to obtain document or patient-level smoking status
-
-smoking
-
-cTAKES_HOME/cTAKESdesc/smokingdesc/analysis_engine/SimulatedProdSmokingTAE.xml
-
-cTAKES_HOME/cTAKESdesc/smokingdesc/collection_processing_engine/Sample_SmokingStatus_output_flatfile.xml
-
-cTAKES_HOME/testdata/smokingtest
-
-Side Effect
-
-the annotator to find side effect mentions and sentences from clinical documents
-
-sideeffect
-
-cTAKES_HOME/cTAKESdesc/sideeffectdesc/analysis_engine/SideEffectAggregateTAE.xml
-
-cTAKES_HOME/cTAKESdesc/sideeffectdesc/collection_processing_engine/SideEffectCPE.xml
-
-cTAKES_HOME/testdata/sideeffecttest
-
-h2. []Next Steps
-
-The [cTAKES 2.5 Component Use Guide|/display/VKC/cTAKES+2.5+Component+Use+Guide] will help you to understand in great detail each of the cTAKES components that have been installed. In some cases you can learn how to improve the components. However, before you go on to process text in production you will need to consider dictionaries and models.
-
-h3. []Dictionaries
-
-h4. []Bundled UMLS Dictionaries
-
-cTAKES includes the complete UMLS (SNOMED-CT and RxNorm) dictionaries.
-
-* An rxnorm_index database (a Lucene index) containing drug names from RxNorm
-* A UMLS database (using two hsqldb tables) containing anatomical sites, procedures, signs/symptoms, and disorders/diseases from SNOMED-CT (umls_ms_2011ab)
-
-To use them, you must have a UMLS username and password, and an Internet connection.
-
-images/icons/emoticons/warning.png*Note*\\If you do not have a UMLS username and password, you may request one at [UMLS Terminology Services|https://uts.nlm.nih.gov/license.html]
-
-In order to use the UMLS dictionaries shipped with cTAKES you will need to do two things:
-
-(1) Change the UMLSUser and UMLSPW <nameValuePair> strings in these descriptor files with your UMLS username and password.
-
-* Dictionary Lookup: <cTAKES_HOME>/cTAKESdesc/lookupdesc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml
-* (optional) Drug NER: <cTAKES_HOME>/cTAKESdesc/drugnerdesc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml
-
-The following shows where in the files you would make the changes. (Do not change the <configurationParameters> by the same name.)
-
-<nameValuePair>
-<name>UMLSUser</name>
-<value>
-<string>YOUR_UMLS_USERNAME_HERE</string>
-</value>
-</nameValuePair>
-<nameValuePair>
-<name>UMLSPW</name>
-<value>
-<string>YOUR_UMLS_PASSWORD_HERE</string>
-</value>
-</nameValuePair>
-
-(2) Include the DictionaryLookupAnnotatorUMLS.xml Analysis Engine within your aggregate Analysis Engine or switch to the ones provided by cTAKES. cTAKES has provided duplicates of shipped Analysis Engine descriptors, put UMLS in the name, and placed DictionaryLookupAnnotatorUMLS.xml within them for these components:
-
-* Dictionary Lookup
-* Clinical Documents pipeline
-* Drug NER
-* Side Effect
-
-So you simply need to switch to using those descriptors. For example, if you were using AggregateCdaProcessor.xml in the Clinical Documents pipeline you would switch to using AggregateCdaUMLSProcessor.xml instead and you will now hook into the complete dictionaries.
-
-You can, of course, modify your own aggregate Analysis Engine files and place the DictionaryLookupAnnotatorUMLS.xml Analysis Engine within them.\\ Since this is an in-memory database implementation, please be patient during the initial load as it could take approximately 20-30 seconds for the database to initialize.
-
-If you would like to go back to using the small sample dictionaries that do not require a UMLS username, use the DictionaryLookupAnnotator.xml (UMLS is not in the file name) Analyis Engine descriptor in your aggregate. Just removing your password from the DictionaryLookupAnnotatorUMLS.xml files will not switch you back to the small sample dictionaries.
-
-h4. []LVG
-
-We have successfully tested the 2008 release of the full [LVG|http://lexsrv2.nlm.nih.gov/LexSysGroup/Projects/lvg/current/docs/userDoc/tools/lvg.html] data. In order to use this release of the full LVG data you should:
-
-# Download either the full version or the lite version from [NIH Lexical Tools|http://lexsrv2.nlm.nih.gov/LexSysGroup/Projects/lvg/2008/web/download.html]
-# Extract the TGZ file that you downloaded with a tool like 7-zip (available online) to a temporary directory. On some operating systems, like Windows, this may need to be done in two steps, 1) to uncompress and 2) to unzip.
-# Replace the directory <cTAKES_HOME>/resources/lvgresources/lvg/data/HSqlDb with data/HSqlDb from your extracted download. Replacing the entire directory is appropriate.
-# In the future, you can upgrade to later versions of LVG by editing the <cTAKES_HOME>/resources/lvgresources/lvg/data/config/lvg.properties file, replacing "lvg2008" with the name of the new release.
-
-h4. []Building Your Own Dictionaries
-
-To install customized dictionaries for RxNorm, SNOMED-CT, or other vocabularies that are available through the UMLS, see the following posts on the cTAKES forums:
-
-* [https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=28&t=423]
-* [https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=28&t=80&start=20#p1459]
-
-h3. []Models
-
-Some models included in cTAKES may not represent your data distribution well. If you want to build or train your own models, please read the [cTAKES 2.5 Component Use Guide|/display/VKC/cTAKES+2.5+Component+Use+Guide], particularly:
-
-* [Training a sentence detector model|https://wiki.nci.nih.gov/display/VKC/cTAKES+2.5+-+Core#cTAKES2.5-Core-ToolsTrainingasentencedetectormodel]
-* Training a Part of Speech (POS) tagger model (Building a model Obtaining training data)
-* Creating a Part of Speech (POS) tag dictionary (Building a tag dictionary)
-* Training a chunker model (Building a model - Prepare GENIA training data)
-* Training a dependency parser (Dependency Parser)
\ No newline at end of file
+  [cTAKES 2.5 Developer Install Instructions]: cTAKES%2B2.5%2BDeveloper%2BInstall%2BInstructions.html
+  [java.com]: http://www.java.com/en/download/faq/develop.xml
+  [source downloads for a released version]: http://sourceforge.net/projects/ohnlp/files/cTAKES
+  [screenshot illustrating step]: attachments/75014322/76808927.jpg
\ No newline at end of file
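
The Java prerequisite in the guide text above (Java 1.6 or higher, checked with `java -version`) could also be verified by a script. The following is a minimal sketch in Python and is not part of cTAKES or of this commit; the function names `parse_java_version` and `meets_minimum` are hypothetical, and the version-string format is assumed to resemble the sample output shown in the guide.

```python
import re


def parse_java_version(output):
    """Return (major, minor) parsed from `java -version` text, or None.

    Assumes the text contains something like: version "1.6.0_20"
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', output)
    if match is None:
        return None
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return (major, minor)


def meets_minimum(output, minimum=(1, 6)):
    """True if the parsed version is at least `minimum` (default: Java 1.6)."""
    version = parse_java_version(output)
    return version is not None and version >= minimum


# Sample text copied from the guide's example output.
sample = (
    'java version "1.6.0_20"\n'
    "Java(TM) SE Runtime Environment (build 1.6.0_20-b02)\n"
)
print(meets_minimum(sample))
```

To check a real installation, the text could be captured with `subprocess.run(["java", "-version"], capture_output=True, text=True)`; note that `java -version` writes its report to standard error, so the `stderr` attribute of the result is the one to pass in.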


