ctakes-dev mailing list archives

From eli mizzou <eli.mi...@gmail.com>
Subject RE: Clinical Pipeline Solution
Date Tue, 22 Oct 2013 00:20:04 GMT
Hi cTAKES folks,

I am trying to figure out how to run the Clinical Document Pipeline from
Java. I have a set of clinical documents as plain text. I want to parse
these documents and extract a list of <doc_ID, CUI, freq> triples; that is,
in document *doc_ID*, *CUI* occurs with frequency *freq*. I spent several
days installing cTAKES and looking for a solution. I narrowed it down to
ClinicalPipelineWithUmls.java, which takes a test text and runs
SimplePipeline with an AnalysisEngineDescription. Here is part of the code:

String documentText = "Text of document to test goes here, such as the "
    + "following. No edema, some soreness, denies pain.";
InputStream inStream =
    InputStreamCollectionReader.convertToByteArrayInputStream(documentText);
CollectionReader collectionReader =
    InputStreamCollectionReader.getCollectionReader(inStream);

AnalysisEngineDescription pipelineIncludingUmlsDictionaries =
    AnalysisEngineFactory.createAnalysisEngineDescription(
        "desc/analysis_engine/AggregatePlaintextUMLSProcessor");

AnalysisEngineDescription xWriter =
    AnalysisEngineFactory.createPrimitiveDescription(
        XWriter.class,
        XWriter.PARAM_OUTPUT_DIRECTORY_NAME, AssertionConst.evalOutputDir,
        XWriter.PARAM_XML_SCHEME_NAME, XWriter.XMI,
        XWriter.PARAM_FILE_NAMER_CLASS_NAME, CtakesFileNamer.class.getName());

SimplePipeline.runPipeline(collectionReader,
    pipelineIncludingUmlsDictionaries, xWriter);
System.out.println("Done at " + new Date());

The problem is that it cannot find "*InputStreamCollectionReader*". I
searched for it but have had no success so far. Could you please give me a
hint or point me in the right direction?
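
To make the <doc_ID, CUI, freq> goal concrete, here is a minimal sketch of
the counting step only. It assumes the CUIs for each document have already
been extracted into a list of strings; the cTAKES part is not shown. The
class name CuiFrequencyCounter, the method countCuis, the document IDs, and
the CUI strings are all hypothetical placeholders, not part of cTAKES.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CuiFrequencyCounter {

    // Counts how often each CUI appears in one document's list of mentions.
    // Insertion order is kept so the output follows first occurrence.
    public static Map<String, Integer> countCuis(List<String> cuiMentions) {
        Map<String, Integer> freq = new LinkedHashMap<>();
        for (String cui : cuiMentions) {
            freq.merge(cui, 1, Integer::sum);
        }
        return freq;
    }

    public static void main(String[] args) {
        // Hypothetical input: doc_ID -> CUIs found in that document.
        Map<String, List<String>> mentionsByDoc = new LinkedHashMap<>();
        mentionsByDoc.put("doc1", Arrays.asList("C0000001", "C0000002", "C0000001"));
        mentionsByDoc.put("doc2", Arrays.asList("C0000002"));

        // Print one tab-separated <doc_ID, CUI, freq> row per distinct CUI.
        for (Map.Entry<String, List<String>> doc : mentionsByDoc.entrySet()) {
            for (Map.Entry<String, Integer> e : countCuis(doc.getValue()).entrySet()) {
                System.out.println(doc.getKey() + "\t" + e.getKey() + "\t" + e.getValue());
            }
        }
    }
}
```

In a real run, the per-document CUI lists would come from the pipeline
output rather than being hard-coded as they are here.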


Thanks for any help!


-Eli
