Reply-To: ctakes-dev@incubator.apache.org Subject: svn commit: r1425006 [1/3] - in /incubator/ctakes/trunk/ctakes-assertion-zoner: ./ .settings/ desc/ desc/cpe/ metadata/ src/ src/main/ src/main/java/ src/main/java/org/ src/main/java/org/mitre/ src/main/java/org/mitre/medfacts/ src/main/java/org/mitre/... 
Date: Fri, 21 Dec 2012 16:30:49 -0000 To: ctakes-commits@incubator.apache.org From: mattcoarr@apache.org X-Mailer: svnmailer-1.0.8-patched Message-Id: <20121221163050.F11ED2388980@eris.apache.org> X-Virus-Checked: Checked by ClamAV on apache.org Author: mattcoarr Date: Fri Dec 21 16:30:48 2012 New Revision: 1425006 URL: http://svn.apache.org/viewvc?rev=1425006&view=rev Log: adding the ctakes-assertion's zoner component (this is the assertion module's version of a sectionizer) Added: incubator/ctakes/trunk/ctakes-assertion-zoner/ incubator/ctakes/trunk/ctakes-assertion-zoner/.classpath incubator/ctakes/trunk/ctakes-assertion-zoner/.project incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/ incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.core.resources.prefs incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.jdt.core.prefs (with props) incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.m2e.core.prefs incubator/ctakes/trunk/ctakes-assertion-zoner/desc/ incubator/ctakes/trunk/ctakes-assertion-zoner/desc/ZonerDescriptor.xml (with props) incubator/ctakes/trunk/ctakes-assertion-zoner/desc/ZonerDescriptorStyleMap.xml (with props) incubator/ctakes/trunk/ctakes-assertion-zoner/desc/cpe/ incubator/ctakes/trunk/ctakes-assertion-zoner/desc/cpe/zonermeds4.xml (with props) incubator/ctakes/trunk/ctakes-assertion-zoner/metadata/ incubator/ctakes/trunk/ctakes-assertion-zoner/metadata/install.xml (with props) incubator/ctakes/trunk/ctakes-assertion-zoner/pom.xml incubator/ctakes/trunk/ctakes-assertion-zoner/pom.xml.backup01 incubator/ctakes/trunk/ctakes-assertion-zoner/src/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/mitre/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/mitre/medfacts/ 
incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/mitre/medfacts/uima/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/mitre/medfacts/uima/RunZoner.java incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/mitre/medfacts/uima/ZoneAnnotator.java (with props) incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/META-INF/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/META-INF/org.uimafit/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/META-INF/org.uimafit/types.txt incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/apache/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/apache/ctakes/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/apache/ctakes/assertion/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/apache/ctakes/assertion/zoner/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/apache/ctakes/assertion/zoner/types/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/apache/ctakes/assertion/zoner/types/TypeSystem.xml incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/mitre/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/mitre/medfacts/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/mitre/medfacts/uima/ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/mitre/medfacts/uima/section_regex.04162012.xml (with props) incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/mitre/medfacts/uima/section_regex.06162011.xml (with props) incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/mitre/medfacts/uima/section_regex.out.xml (with props) 
incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/mitre/medfacts/uima/section_regex.xml (with props) incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/mitre/medfacts/uima/section_regex.xml.bak (with props) incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/mitre/medfacts/uima/section_regex.xml.bak2 (with props) Added: incubator/ctakes/trunk/ctakes-assertion-zoner/.classpath URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/.classpath?rev=1425006&view=auto ============================================================================== --- incubator/ctakes/trunk/ctakes-assertion-zoner/.classpath (added) +++ incubator/ctakes/trunk/ctakes-assertion-zoner/.classpath Fri Dec 21 16:30:48 2012 @@ -0,0 +1,37 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Added: incubator/ctakes/trunk/ctakes-assertion-zoner/.project URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/.project?rev=1425006&view=auto ============================================================================== --- incubator/ctakes/trunk/ctakes-assertion-zoner/.project (added) +++ incubator/ctakes/trunk/ctakes-assertion-zoner/.project Fri Dec 21 16:30:48 2012 @@ -0,0 +1,24 @@ + + + UIMA AFEM Zone Annotator + + + + + + org.eclipse.jdt.core.javabuilder + + + + + org.eclipse.m2e.core.maven2Builder + + + + + + org.eclipse.m2e.core.maven2Nature + org.eclipse.jdt.core.javanature + org.apache.uima.pear.UimaNature + + Added: incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.core.resources.prefs URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.core.resources.prefs?rev=1425006&view=auto ============================================================================== --- incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.core.resources.prefs (added) +++ 
incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.core.resources.prefs Fri Dec 21 16:30:48 2012 @@ -0,0 +1,5 @@ +eclipse.preferences.version=1 +encoding//src/main/java=UTF-8 +encoding//src/main/resources=UTF-8 +encoding//target/generated-sources/jcasgen=UTF-8 +encoding/=UTF-8 Added: incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.jdt.core.prefs URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.jdt.core.prefs?rev=1425006&view=auto ============================================================================== --- incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.jdt.core.prefs (added) +++ incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.jdt.core.prefs Fri Dec 21 16:30:48 2012 @@ -0,0 +1,12 @@ +eclipse.preferences.version=1 +org.eclipse.jdt.core.compiler.codegen.inlineJsrBytecode=enabled +org.eclipse.jdt.core.compiler.codegen.targetPlatform=1.6 +org.eclipse.jdt.core.compiler.codegen.unusedLocal=preserve +org.eclipse.jdt.core.compiler.compliance=1.6 +org.eclipse.jdt.core.compiler.debug.lineNumber=generate +org.eclipse.jdt.core.compiler.debug.localVariable=generate +org.eclipse.jdt.core.compiler.debug.sourceFile=generate +org.eclipse.jdt.core.compiler.problem.assertIdentifier=error +org.eclipse.jdt.core.compiler.problem.enumIdentifier=error +org.eclipse.jdt.core.compiler.problem.forbiddenReference=warning +org.eclipse.jdt.core.compiler.source=1.6 Propchange: incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.jdt.core.prefs ------------------------------------------------------------------------------ svn:executable = * Added: incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.m2e.core.prefs URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.m2e.core.prefs?rev=1425006&view=auto ============================================================================== --- 
incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.m2e.core.prefs (added) +++ incubator/ctakes/trunk/ctakes-assertion-zoner/.settings/org.eclipse.m2e.core.prefs Fri Dec 21 16:30:48 2012 @@ -0,0 +1,4 @@ +activeProfiles= +eclipse.preferences.version=1 +resolveWorkspaceProjects=true +version=1 Added: incubator/ctakes/trunk/ctakes-assertion-zoner/desc/ZonerDescriptor.xml URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/desc/ZonerDescriptor.xml?rev=1425006&view=auto ============================================================================== --- incubator/ctakes/trunk/ctakes-assertion-zoner/desc/ZonerDescriptor.xml (added) +++ incubator/ctakes/trunk/ctakes-assertion-zoner/desc/ZonerDescriptor.xml Fri Dec 21 16:30:48 2012 @@ -0,0 +1,92 @@ + + + org.apache.uima.java + true + org.mitre.medfacts.uima.Zoner + + ZonerDescriptor + + 1.0 + MITRE + + + + + + org.mitre.medfacts.uima.Heading + a section heading + uima.tcas.Annotation + + + label + + uima.cas.String + + + + + org.mitre.medfacts.uima.Zone + A document Zone, including its heading + uima.tcas.Annotation + + + label + + uima.cas.String + + + + + org.mitre.medfacts.uima.Subzone + + uima.tcas.Annotation + + + label + + uima.cas.String + + + + + + + + + + + + + + + + true + true + false + + + + + SectionRegex + + false + + + + + + SectionHeaderRegularExpressions + + + file:org/mitre/medfacts/uima/section_regex.xml + + + + + + SectionRegex + SectionHeaderRegularExpressions + + + + Propchange: incubator/ctakes/trunk/ctakes-assertion-zoner/desc/ZonerDescriptor.xml ------------------------------------------------------------------------------ svn:executable = * Added: incubator/ctakes/trunk/ctakes-assertion-zoner/desc/ZonerDescriptorStyleMap.xml URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/desc/ZonerDescriptorStyleMap.xml?rev=1425006&view=auto ============================================================================== --- 
incubator/ctakes/trunk/ctakes-assertion-zoner/desc/ZonerDescriptorStyleMap.xml (added) +++ incubator/ctakes/trunk/ctakes-assertion-zoner/desc/ZonerDescriptorStyleMap.xml Fri Dec 21 16:30:48 2012 @@ -0,0 +1,3 @@ + + + Propchange: incubator/ctakes/trunk/ctakes-assertion-zoner/desc/ZonerDescriptorStyleMap.xml ------------------------------------------------------------------------------ svn:executable = * Added: incubator/ctakes/trunk/ctakes-assertion-zoner/desc/cpe/zonermeds4.xml URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/desc/cpe/zonermeds4.xml?rev=1425006&view=auto ============================================================================== --- incubator/ctakes/trunk/ctakes-assertion-zoner/desc/cpe/zonermeds4.xml (added) +++ incubator/ctakes/trunk/ctakes-assertion-zoner/desc/cpe/zonermeds4.xml Fri Dec 21 16:30:48 2012 @@ -0,0 +1,58 @@ + + + + + + + + + + InputDirectory + + C:\a_projects\AFEM\i2b2 data\2009_Medications\TrainingDataText\4 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + OutputDirectory + + C:\a_projects\AFEM\i2b2 data\uimaout\Meds4 + + + + + + + -1 + immediate + + + + Propchange: incubator/ctakes/trunk/ctakes-assertion-zoner/desc/cpe/zonermeds4.xml ------------------------------------------------------------------------------ svn:executable = * Added: incubator/ctakes/trunk/ctakes-assertion-zoner/metadata/install.xml URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/metadata/install.xml?rev=1425006&view=auto ============================================================================== --- incubator/ctakes/trunk/ctakes-assertion-zoner/metadata/install.xml (added) +++ incubator/ctakes/trunk/ctakes-assertion-zoner/metadata/install.xml Fri Dec 21 16:30:48 2012 @@ -0,0 +1,27 @@ + + + + Windows + + + 1.4.0 + + + + + UIMA_AFEM_Zone_Annotator + + $main_root/desc/ZonerDescriptor.xml + standard + + + + set_env_variable + + + 
$main_root/bin;$main_root/resources;$main_root/lib/med-facts-zoner-1.0-SNAPSHOT.jar; + CLASSPATH + + + + Propchange: incubator/ctakes/trunk/ctakes-assertion-zoner/metadata/install.xml ------------------------------------------------------------------------------ svn:executable = * Added: incubator/ctakes/trunk/ctakes-assertion-zoner/pom.xml URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/pom.xml?rev=1425006&view=auto ============================================================================== --- incubator/ctakes/trunk/ctakes-assertion-zoner/pom.xml (added) +++ incubator/ctakes/trunk/ctakes-assertion-zoner/pom.xml Fri Dec 21 16:30:48 2012 @@ -0,0 +1,64 @@ + + + + 4.0.0 + ctakes-assertion-zoner + Apache cTAKES Assertion's zoner + + org.apache.ctakes + ctakes + 3.1.0-incubating-SNAPSHOT + + + + + + net.sf.mastif + mastif-zoner + 1.5-SNAPSHOT + + + org.apache.uima + uimaj-core + + + org.uimafit + uimafit + + + org.apache.ctakes + ctakes-core + + + + + + + org.cleartk + jcasgen-maven-plugin + + src/main/resources/org/apache/ctakes/assertion/zoner/types/TypeSystem.xml + + + + + Added: incubator/ctakes/trunk/ctakes-assertion-zoner/pom.xml.backup01 URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/pom.xml.backup01?rev=1425006&view=auto ============================================================================== --- incubator/ctakes/trunk/ctakes-assertion-zoner/pom.xml.backup01 (added) +++ incubator/ctakes/trunk/ctakes-assertion-zoner/pom.xml.backup01 Fri Dec 21 16:30:48 2012 @@ -0,0 +1,251 @@ + + + + 4.0.0 + ctakes-assertion-zoner + Apache cTAKES Assertion + + org.apache.ctakes + ctakes + 3.1.0-incubating-SNAPSHOT + + + + + + net.sf.mastif + mastif-zoner + 1.4 + + + + org.apache.uima + uimaj-core + + + + + + + + org.cleartk + jcasgen-maven-plugin + + src/main/resources/org/apache/ctakes/assertion/zoner/types/TypeSystem.xml + + + + + Added: 
incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/mitre/medfacts/uima/RunZoner.java URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/mitre/medfacts/uima/RunZoner.java?rev=1425006&view=auto ============================================================================== --- incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/mitre/medfacts/uima/RunZoner.java (added) +++ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/mitre/medfacts/uima/RunZoner.java Fri Dec 21 16:30:48 2012 @@ -0,0 +1,163 @@ +package org.mitre.medfacts.uima; + +import java.io.File; +import java.io.FilenameFilter; +import java.io.IOException; +import java.net.URI; +import java.net.URISyntaxException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; +import java.util.logging.Logger; + +import org.apache.ctakes.core.util.CtakesFileNamer; +import org.apache.ctakes.core.ae.DocumentIdPrinterAnalysisEngine; +import org.apache.ctakes.core.cr.XMIReader; +import org.apache.uima.UIMAException; +import org.apache.uima.analysis_engine.AnalysisEngineDescription; +import org.apache.uima.collection.CollectionReader; +import org.apache.uima.resource.DataResource; +import org.apache.uima.resource.ExternalResourceDescription; +import org.apache.uima.resource.ResourceInitializationException; +import org.apache.uima.resource.SharedResourceObject; +import org.apache.uima.resource.metadata.TypeSystemDescription; +import org.uimafit.component.xwriter.XWriter; +import org.uimafit.factory.AggregateBuilder; +import org.uimafit.factory.AnalysisEngineFactory; +import org.uimafit.factory.CollectionReaderFactory; +import org.uimafit.factory.ExternalResourceFactory; +import org.uimafit.factory.TypeSystemDescriptionFactory; +import org.uimafit.factory.UimaContextFactory; +import org.uimafit.pipeline.SimplePipeline; + +public class RunZoner +{ + private static Logger logger = 
Logger.getLogger(RunZoner.class.getName()); + + File inputDirectory; + List<File> inputFiles; + + File outputDirectory; + + public static void main(String args[]) throws UIMAException, IOException, URISyntaxException + { + if (args.length != 2) + { + System.err.format("Syntax: %s input_directory output_directory%n", RunZoner.class.getName()); + } + + File inputDirectory = new File(args[0]); + File outputDirectory = new File(args[1]); + + List<File> inputFiles = listContents(inputDirectory); + + RunZoner runner = new RunZoner(); + runner.setInputDirectory(inputDirectory); + runner.setInputFiles(inputFiles); + runner.setOutputDirectory(outputDirectory); + + runner.execute(); + } + + public static List<File> listContents(File inputDirectory) + { + File fileArray[] = inputDirectory.listFiles(new FilenameFilter() + { + + @Override + public boolean accept(File dir, String name) + { + return name.endsWith(".xmi"); + } + }); + + List<File> fileList = Arrays.asList(fileArray); + return fileList; + } + + public void execute() throws UIMAException, IOException, URISyntaxException + { + AggregateBuilder builder = new AggregateBuilder(); + + TypeSystemDescription typeSystemDescription = TypeSystemDescriptionFactory.createTypeSystemDescriptionFromPath(); + + CollectionReader reader = + CollectionReaderFactory.createCollectionReader( + XMIReader.class, + typeSystemDescription, + XMIReader.PARAM_FILES, + inputFiles); + + AnalysisEngineDescription documentIdPrinter = + AnalysisEngineFactory.createPrimitiveDescription(DocumentIdPrinterAnalysisEngine.class); + builder.add(documentIdPrinter); + + URI generalSectionRegexFileUri = + this.getClass().getClassLoader().getResource("org/mitre/medfacts/zoner/section_regex.xml").toURI(); +// ExternalResourceDescription generalSectionRegexDescription = ExternalResourceFactory.createExternalResourceDescription( +// SectionRegexConfigurationResource.class, new File(generalSectionRegexFileUri)); + AnalysisEngineDescription zonerAnnotator = + 
AnalysisEngineFactory.createPrimitiveDescription(ZoneAnnotator.class, + ZoneAnnotator.PARAM_SECTION_REGEX_FILE_URI, + generalSectionRegexFileUri + ); + builder.add(zonerAnnotator); + + URI mayoSectionRegexFileUri = + this.getClass().getClassLoader().getResource("org/mitre/medfacts/zoner/mayo_sections.xml").toURI(); +// ExternalResourceDescription mayoSectionRegexDescription = ExternalResourceFactory.createExternalResourceDescription( +// SectionRegexConfigurationResource.class, new File(mayoSectionRegexFileUri)); + AnalysisEngineDescription mayoZonerAnnotator = + AnalysisEngineFactory.createPrimitiveDescription(ZoneAnnotator.class, + ZoneAnnotator.PARAM_SECTION_REGEX_FILE_URI, + mayoSectionRegexFileUri + ); + builder.add(mayoZonerAnnotator); + + AnalysisEngineDescription xWriter = AnalysisEngineFactory.createPrimitiveDescription( + XWriter.class, + typeSystemDescription, + XWriter.PARAM_OUTPUT_DIRECTORY_NAME, + outputDirectory.toString(), + XWriter.PARAM_FILE_NAMER_CLASS_NAME, + CtakesFileNamer.class.getName() + ); + + builder.add(xWriter); + + logger.info("BEFORE RUNNING PIPELINE..."); + SimplePipeline.runPipeline(reader, builder.createAggregateDescription()); + logger.info("AFTER RUNNING PIPELINE...COMPLETED"); + } + + public File getInputDirectory() + { + return inputDirectory; + } + + public void setInputDirectory(File inputDirectory) + { + this.inputDirectory = inputDirectory; + } + + public List<File> getInputFiles() + { + return inputFiles; + } + + public void setInputFiles(List<File> inputFiles) + { + this.inputFiles = inputFiles; + } + + public File getOutputDirectory() + { + return outputDirectory; + } + + public void setOutputDirectory(File outputDirectory) + { + this.outputDirectory = outputDirectory; + } + +} Added: incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/mitre/medfacts/uima/ZoneAnnotator.java URL: 
http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/mitre/medfacts/uima/ZoneAnnotator.java?rev=1425006&view=auto ============================================================================== --- incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/mitre/medfacts/uima/ZoneAnnotator.java (added) +++ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/mitre/medfacts/uima/ZoneAnnotator.java Fri Dec 21 16:30:48 2012 @@ -0,0 +1,136 @@ +package org.mitre.medfacts.uima; + +import java.io.IOException; +import java.net.URI; +import java.util.Iterator; +import java.util.List; +import java.util.logging.Logger; + +import org.apache.ctakes.assertion.zoner.types.Heading; +import org.apache.ctakes.assertion.zoner.types.Subzone; +import org.apache.ctakes.assertion.zoner.types.Zone; +import org.apache.uima.UimaContext; +//import org.apache.uima.analysis_component.JCasAnnotator_ImplBase; +import org.apache.uima.analysis_engine.AnalysisEngineProcessException; +import org.apache.uima.jcas.JCas; +import org.apache.uima.resource.DataResource; +import org.apache.uima.resource.ResourceAccessException; +import org.apache.uima.resource.ResourceInitializationException; +import org.apache.uima.resource.SharedResourceObject; + +//import org.mitre.medfacts.zoner.*; +import org.mitre.medfacts.zoner.ZonerCli; +import org.mitre.medfacts.zoner.ZonerCli.HeadingRange; +import org.mitre.medfacts.zoner.ZonerCli.Range; +//import org.mitre.medfacts.zoner.ZonerCliSimplified; +//import org.mitre.medfacts.zoner.ZonerCliSimplified.HeadingRange; +//import org.mitre.medfacts.zoner.ZonerCliSimplified.Range; +import org.uimafit.component.JCasAnnotator_ImplBase; +import org.uimafit.descriptor.ConfigurationParameter; +import org.uimafit.descriptor.ExternalResource; +import org.uimafit.descriptor.TypeCapability; + +@TypeCapability(outputs = +{ + "org.apache.ctakes.assertion.zoner.types.Zone", + "org.apache.ctakes.assertion.zoner.types.Zone:label", + 
"org.apache.ctakes.assertion.zoner.types.Subzone", + "org.apache.ctakes.assertion.zoner.types.Subzone:label", + "org.apache.ctakes.assertion.zoner.types.Heading", + "org.apache.ctakes.assertion.zoner.types.Heading:label" +}) + +public class ZoneAnnotator extends JCasAnnotator_ImplBase { + public static final String PARAM_SECTION_REGEX_FILE_URI = "SectionRegex"; + + @ConfigurationParameter( + name = PARAM_SECTION_REGEX_FILE_URI, + description = "xml configuration file with zone regular expression values", + mandatory = true) + protected URI sectionRegexFileUri; + + protected final Logger logger = Logger.getLogger(ZoneAnnotator.class.getName()); + + //private ZonerCliSimplified zonerCli; + + @Override + public void initialize (UimaContext aContext) throws ResourceInitializationException { + super.initialize(aContext); + // Create ZonerCli using the resource URI +// URI uri; +// try { +// uri = getContext().getResourceURI(PARAM_SECTION_REGEX_FILE_NAME); +// } catch (ResourceAccessException e) { +// e.printStackTrace(); +// throw new ResourceInitializationException(e); +// } + //zonerCli = new ZonerCliSimplified(sectionRegexFileUri); + + } + + private int countOfIndexOutOfBounds = 0; + + @Override + public void process(JCas jcas) throws AnalysisEngineProcessException { +// ZonerCliSimplified zonerCli = +// new ZonerCliSimplified(sectionRegexFileUri); + ZonerCli zonerCli = + new ZonerCli(sectionRegexFileUri); + + zonerCli.setEntireContents(jcas.getDocumentText()); + // initialize converter once contents are set + zonerCli.initialize(); + try { + zonerCli.execute(); + } catch (IOException e) { + // TODO Auto-generated catch block + e.printStackTrace(); + return; + //throw new AnalysisEngineProcessException(e); + } catch (StringIndexOutOfBoundsException e) { + // TODO Auto-generated catch block + e.printStackTrace(); + System.out.format("string index out of bounds exception count: %d%n", ++countOfIndexOutOfBounds); + return; + //throw new 
AnalysisEngineProcessException(e); + } + // Add the zone annotations + List<Range> rangeList = zonerCli.getRangeList(); + for (Iterator<Range> i = rangeList.iterator(); i.hasNext(); ) { + Range r = i.next(); + Zone zAnnot = new Zone(jcas); + zAnnot.setBegin(r.getBegin()); + zAnnot.setEnd(r.getEnd()); + zAnnot.setLabel(r.getLabel()); + zAnnot.addToIndexes(); + logger.info(String.format("added new zone annotation [%d-%d] \"%s\"", zAnnot.getBegin(), zAnnot.getEnd(), zAnnot.getCoveredText())); + } + + List<Range> subsectionRangeList = zonerCli.getSubsections(); + for (Iterator<Range> i = subsectionRangeList.iterator(); i.hasNext(); ) { + Range r = i.next(); + Subzone sAnnot = new Subzone(jcas); + sAnnot.setBegin(r.getBegin()); + sAnnot.setEnd(r.getEnd()); + sAnnot.setLabel(r.getLabel()); + sAnnot.addToIndexes(); + logger.info(String.format("added new subzone annotation [%d-%d] \"%s\"", sAnnot.getBegin(), sAnnot.getEnd(), sAnnot.getCoveredText())); + } + + + + // Add the heading annotations + List<HeadingRange> headings = zonerCli.getHeadings(); + for (Iterator<HeadingRange> i = headings.iterator(); i.hasNext(); ) { + HeadingRange r = i.next(); + Heading hAnnot = new Heading(jcas); + hAnnot.setBegin(r.getHeadingBegin()); + hAnnot.setEnd(r.getHeadingEnd()); + hAnnot.setLabel(r.getLabel()); + hAnnot.addToIndexes(); + logger.info(String.format("added new headingrange annotation [%d-%d] \"%s\"", hAnnot.getBegin(), hAnnot.getEnd(), hAnnot.getCoveredText())); + } + + } + +} Propchange: incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/java/org/mitre/medfacts/uima/ZoneAnnotator.java ------------------------------------------------------------------------------ svn:executable = * Added: incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/META-INF/org.uimafit/types.txt URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/META-INF/org.uimafit/types.txt?rev=1425006&view=auto ============================================================================== --- 
incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/META-INF/org.uimafit/types.txt (added) +++ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/META-INF/org.uimafit/types.txt Fri Dec 21 16:30:48 2012 @@ -0,0 +1 @@ +classpath*:org/apache/ctakes/assertion/zoner/types/TypeSystem.xml Added: incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/apache/ctakes/assertion/zoner/types/TypeSystem.xml URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/apache/ctakes/assertion/zoner/types/TypeSystem.xml?rev=1425006&view=auto ============================================================================== --- incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/apache/ctakes/assertion/zoner/types/TypeSystem.xml (added) +++ incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/apache/ctakes/assertion/zoner/types/TypeSystem.xml Fri Dec 21 16:30:48 2012 @@ -0,0 +1,66 @@ + + + + org.apache.ctakes.assertion.types.TypeSystem + + 1.0 + + + + org.apache.ctakes.assertion.zoner.types.Heading + a section heading + uima.tcas.Annotation + + + label + + uima.cas.String + + + + + org.apache.ctakes.assertion.zoner.types.Zone + A document Zone, including its heading + uima.tcas.Annotation + + + label + + uima.cas.String + + + + + org.apache.ctakes.assertion.zoner.types.Subzone + + uima.tcas.Annotation + + + label + + uima.cas.String + + + + + + Added: incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/mitre/medfacts/uima/section_regex.04162012.xml URL: http://svn.apache.org/viewvc/incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/mitre/medfacts/uima/section_regex.04162012.xml?rev=1425006&view=auto ============================================================================== --- incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/mitre/medfacts/uima/section_regex.04162012.xml (added) +++ 
incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/mitre/medfacts/uima/section_regex.04162012.xml Fri Dec 21 16:30:48 2012 @@ -0,0 +1,641 @@ + + + + + + admit + (?:admit(?:ting)?|admission) + + + be + (\s+(?:am|is|are|was|were)) + + + left-time-mods + ((?:(?:pre-)?|current|discharge|follow[\s+\-]?up|home|interim|initial|postpartum|(?:pre|post)op(?:erative)?|transfer)\s+) + + + left-descr-mods + ((?:brief(?:\s+resume\s+of)?|overall|other|additional|pertinent|relevant|(?:list\s+of\s+)?other|plan\s+for|standardized|summary\s+of|doctor(?:\s+\')?s)\s+) + + + include + (\s+(?:include[ds]?)) + + + right-time-mods + (\s+(?:(?:(?:UP)?ON|AS|AT|\@|OF|FOR|(?:PRIOR\s+TO)|(?:DURING))\s+(?:(?:(?:THE|THAT|THIS)\s+)?(?:TIME|DAY)\s+OF\s+)?(?:ADMISSION|ADMIT|DEATH|DISCHARGE|TRANSFER|COMPLICATIONS?|HOME|HOSPITALIZATION|OUTPATIENT))) + + + right-descr-mods + (\s+(?:(?:by\s+(?:problems?|(?:organ\s+)?systems?|issues?|report))|(?:\s+of\s+note))) + + + right-location-mods + (\s+in\s+the\s+emergency\s+department) + + + + + allergies1 + (?:allergies|allergy|adverse\s+drug\s+reactions?) + + + allergies2 + (allergy(\s*\/\s*reaction)?\s+profile) + + + famhistory + (?:family\s+(?:(?:medical|social)\s+)?history|fhx?|fam\s+hx) + + + sochistory + (social\s+history|social\s+hx|\bshx?|soc\s*hx|habits) + + + + medication1 + (medications?|medication\s*\(\s*s\s*\)|regimen|treatment\s+cycle) + + + medication2 + (meds?) + + + + physicalexam1 + ((physical|surgical)(\s+exam(ination)?|\s+findings)?|\bpe|\*pe|on\s+exam(ination)?|examination\s+data) + + + physicalexam2 + ((daily|patient)\s+status) + + + + + + + + time + (\d?\d\:\d\d[aApP][mM]) + + + mayodate + (([0123])?\d(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)(\d\d)?\d\d\s+) + + + mayodate2 + ([01]?\d/\d?\d/(\d\d)?\d\d\s+) + + + + + + +
+ ((addendum)\s*\:) + +
+ +
+ (?:(?:(?:admission\s+date)|(date\s+of\s+admission))\s*\:) + +
+ +
+ age\s*\: + +
+ +
+ (?:(?:|)\s*\:) + +
+ +
+ ((()|())?(appointments?(?:\(?:\s+s\s+\))?)\s*(?:\(\s+s\s+\))?\s*\:) + +
+ +
+ ((|)?(assessment)()?\s*\:) + +
+ +
+ ((()|)?((?:assessment\s*(?:and|\&|\/)\s*plan)|(?:a\s*(?:\&|\/)\s*p))()?\s*\:) + +
+ +
+ (?:attend(?:ing)?(?:\s+physician)?\s*\:) + +
+
+ ([A-Z]{2,2}\d{3,3}\/\d{4,4}\s+[A-Z]+\s+([A-Z]\.\s+)?[A-Z]+\s+\,(?:\s+(?:jr|iii)\s*\,\s*)?\s+M\.D\.)\s*\: + +
+ +
+ (?:(?:chief\s+complaint)|(?:patient\s+states\s+complaint)\s*\:) + +
+ +
+ ((clinical\s+findings?)\s*\:) + +
+ +
+ (?:((code(?:\s+status)?)\s*\:)) + +
+ +
+ (((?:additional\s+)comments)\s*\:) + +
+ +
+ (?:((complications?)\s*\:)) + +
+ +
+ (()?(condition)()?\s*\:) + +
+ +
+ ((consultants?)\s*\:) + +
+ +
+ ((consults)\s*\:) + +
+ +
+ date\s+of\s+birth\s*\: + +
+ +
+ date\s+of\s+expiration\s*\: + +
+ +
+ (((?:principal\s+discharge\s+diagnosis\s*;\s*responsible\s+after\s+study\s+for\s+causing\s+admission\s*\)\s?))\s?) + +
+
+ ((?:other\s+diagnosis\s*;\s*conditions\s*\,\s*infections\s*,\s*complications\s*,\s*affecting\s+treatment\s*\/?\s*stay)\s?) + +
+
+ (((?:cause\s+of\s+death\s+and\s+)?((?:final|principal|principle|primary|associated|secondary|additional|other)\s+)?(((?:pre-)?admission|admit(?:ting)?|discharge|(?:pre|post)?operative)\s+)?(?:problems\s+and\s+)?diagnos[ei]s(?:\s*\(\s*es\s*\))?)()?\s*\:) + +
+ +
+ ((?:(?:(?:entered|dictated)\s+by\b)|dictator)\s*\:) + +
+
+ ((?:(?:[A-Z]+\.?)\s+)*[A-Z]+\s*\,\s+M\.D\.\s+DICTATING\s+FOR\s*\:) + +
+
+ (This\s+report\s+was\s+created\s+by\s+[A-Z]) + +
+ +
+ (()?(diet)\s*\:) + +
+ +
+ (((?:discharge\s+)?activity)\s*\:) + +
+
+ (^((?:discharge\s+)?activity)\s*)(\n) + +
+ +
+ ((discharge\s+date)|(date\s+of\s+discharge))\s*\: + +
+ +
+ ((discharge\s+patient\s+on)\s*\:) + +
+ +
+ (((?:doctor\'?s\s+)?discharge\s+orders)\s*\:) + +
+ +
+ (ed\s+discharge\s+notification\s*\/\s*summary)(\s+) + +
+ +
+ (()?(disposition)()?\s*\:) + +
+ +
+ (()?((?:ed\s+course)|(?:emergency\s+department\s+course)|(?:in\s+(?:the\s+)?ed|\bed))?(\s*\:|\*+)) + +
+ +
+ evaluation\s+date\s*\: + +
+ +
+ ((events)\s*\:) + +
+ +
+ (?:(?:)\s*\:) + +
+
+ (?:(?:FAMILY\s+HISTORY\s+[a-z]+)) + +
+ +
+ ((discharged?\s+wound\s+care)\s*\:) + +
+
+ (((?:discharged?\s+follow[\s\-]?up(?:\s+care)?)|((?:standardized\s+)?discharge\s+orders\s+\(\s+medications\s+instructions\s+to\s+patient\s+\,\s+follow-up\s+care\s+\))|(disposition\s*\,\s+follow[\s+\-]up\s+and\s+instructions\s+to\s+patient)|(?:follow[\s\-]?up))\s*\:) + +
+
+ ((((?:discharged?\s+follow[\s\-]?up(?:\s+care)?))|((?:standardized\s+)?discharge\s+orders\s+\(\s+medications\s+instructions\s+to\s+patient\s+\,\s+follow-up\s+care\s+\))|(disposition\s*\,\s+follow[\s+\-]up\s+and\s+instructions\s+to\s+patient)|(?:follow[\s\-]?up))\s*)(\n) + +
+ +
+ sex\s*\: + +
+ +
+ ^(health\s+status)\s*(\n) + +
+ +
+ (()?(?:history\s+(?:of\s+(?:the\s+)?)?(?:present|physical)\s+illness|hpi)\s+and\s+(?:(?:history\s+and\s+)?reasons?\s+(?:for\s+)?(?:hospitalization|admission|admit))\s*\:) + +
+
+ ((?:(?:history\s+and\s+)?reasons?\s+(?:for\s+)?(?:hospitalization|admission|admit))\s*\:) + +
+
+ (()?(((history\s+(of\s+(the\s+)?)?)?(present|physical)\s+illness)|hpi)\s*\:) + +
+
+ (((?:history\s+(?:of\s+(?:the\s+)?)?(?:present|physical)\s+illness)|hpi)\s*)\n + +
+
+ (^()?((?:clinical\s+)?history)\s*\:) + +
+ +
+ (()?(?:hosp(?:ital)?\s+course(?:\s+and\s+treatment)?)|(?:summary\s+of\s+hospitalization)?(\s*\:|\*+|\-+)) + +
+
+ (()?(?:hosp(?:ital)?\s+course(?:\s+and\s+treatment)?)|(?:summary\s+of\s+hospitalization)?)(\s*\n) + +
+
+ (\*+(hosp(?:ital)?\s+course(?:\s+and\s+treatment)?)|(?:summary\s+of\s+hospitalization)) + +
+
+ ((hosp(?:ital)?\s+course(\s+and\s+treatment)?)|(summary\s+of\s+hospitalization)\s+([a-z]+|\()) + +
+ +
+ ((impressions?|imp)()?\s*\:) + +
+ +
+ ((?:impression\s+and\s+plan)\s*\:) + +
+ +
+ ((?:impression\s+and\s+plan)\s*)(\n) + +
+ +
+ ((()|())+(instructions?)()?\s*\:) + +
+
+ (^(instructions?)()?\s*)(\n) + +
+ +
+ ((?:()|())?((?:summary\s+of\s+)?(?:(?:diagnostic\s+)?(?:laboratory|labs?|laboratories|diagnostic|radiologic)(?:(?:\s+and\s+)|(?:\/))?\s*(?:radiologic\s+)?(?:studies|data|results|exams?(?:inations?)?|evaluation|values|findings|tests?|x\-rays?)?)|(?:tests\/procedures|(?:studies\/(?:procedures|tests))|(?:proc(?:edures)?\/tests)|tests))()?\s*\:) + +
+
+ (((?:new|discrete|(?:today\s+\'s))\s+results(?:\s+only)?)\s*\:) + +
+
+ (((?:new|discrete|(?:today\s+\'s))\s+results(?:\s+only)?)\s*)\n + +
+
+ (()\s*data()?\s*\:) + +
+
+ ((?:\-+|\*+|\_+)\s*)(data()?\s*\:) + +
+ +
+ ()?()()?(|)?\s*\: + +
+
+ ^()?()()?(|)?\s*\n + +
+
+ ()()()?(|?\s*\:) + +
+
+ ^()()?(|?\s*\:) + +
+
+ (()?()\s+on\s+transfer\s+from(\s+[A-Z]+)+(|)?\s*\:) + +
+
+ (MEDICATIONS()?)(\s+[a-z]+) + +
+ +
+ (((?:(?:past\s+)?ob\-gyn\s+history))\s*\:) + +
+
+ (((?:past\s+obstetric\s+history))\s*\:) + +
+
+ ^(((?:past\s+gyn(?:ocologic(?:al)?)?\s+history))\s*\:) + +
+ +
+ (?:objective\s*\:?\n) + +
+
+ (?:^o\s*\:\n) + +
+ +
+ (((?:((?:principal|principle|primary|special)\s+)?(?:operations?\s+(?:and|or)\s+)?procedures?(\s+(?:and|or)\s+operations?)?(?:\s+performed)?)|(operations\s*\/\s*procedures)|(?:operations?))\s*\:) + +
+
+ ((((?:associated|secondary|additional|other)\s+)(?:operations?\s+(?:and|or)\s+)?procedures?(\s+(?:and|or)\s+operations?)?)\s*\:) + +
+
+ ((procedure\s+note)\s*\:) + +
+
+ ((procedure\s+in\s+detail)\s*\:) + +
+
+ ((postpartum\s+diagnostic\s+procedures)\s*\:) + +
+
+ ((postpartum\s+therapeutic\s+procedures)\s*\:) + +
+
+ ((other\s+postpartum\s+therapies)\s*\:) + +
+
+ ^procedure:\s*$ + +
+ +
+ ((past\s+cardiac\s+history)\s*\:) + +
+ +
+ (past\s+(medical\s+)?history|medical\s+history|pmhx?|pmedhx)\s*\: + +
+
+ ((past\s+(?:medical\s+)?history|medical\s+history|pmhx?|pmedhx)\/()\s*\:) + +
+ +
+ (past\s+surgical\s+history)\s*\: + +
+ +
+ (\*+\s+(FINAL\s+)?DISCHARGE\s+ORDERS\s+\*+) + +
+ +
+ (()?()?(|)()??(?:\s*\:|\-[^a-z])) + +
+
+ (()?()?(|)()?()?)((\s+vss?)(?:\s*\:|\-[^a-z])) + +
+
+ ((?:\-+|\*+|\_+|())\s*(?:exam(?:ination)?(?:\s+data)?()\s*\:)) + +
+
+ ^(()(?:exam(?:ination)?)(?:\s*\:)) + +
+
+ ^(()?()?(|))(\n) + +
+ +
+ (?:^(()?(\bplans?)(\s*\:[^\d]|\s*\*))) + +
+
+ (?:^(()?(\bplans?))(\s*\n)) + +
+ +
+ (((?:disposition\s*(?:and|\&|\/)\s*plan)|(?:disposition\s*(?:\&|\/)plan))()?\s*\:) + +
+ +
+ (((?:to\s+do\s*\/\s*plans?))\s*\:) + +
+ +
+ (($pch)\s*\:)(\s+) + +
+ +
+ ((preoperative\s+(?:cardiac\s+)?status)\s*\:) + +
+ +
+ ((previous\s+cardiovascular\s+interventions?)\s*\:) + +
+ +
+ (?:pcp\s+name\s*\:) + +
+ +
+ (((?:(?:patient\s+)?problem\s+list)|(?:significant\s+problems)|(?:(?:(?:his|her)\s+)?problems\s+and\s+management\s+are\s+as\s+follows) )?\s*[\:;\*+]) + +
+ +
+ (?:(?:radiology(?:\s+(?:studies|data|results))?)\s*\:) + +
+ +
+ (?:reason\s+(?:for\s+)?consultation\s*\:) + +
+ +
+ ((report\s+status)\s*\:) + +
+ +
+ (?:(review\s+(?:of\s+)?systems|ros)\s*\:) + +
+
+ (?:(REVIEW\s+OF\s+SYSTEMS)(\s+[a-z]+)) + +
+ +
+ (?:service\s*\:) + +
+ +
+ (?:^()\s*\:) + +
+
+ (?:((SOCIAL\s+HISTORY|HABITS))(\s+[a-z]+)) + +
+ +
+ (?:subjective\s*\:?\n) + +
+
+ (?:^s\s*\:\n) + +
+ +
+ (((?:escription\s+document)|batch)\s*\:) + +
+ +
+ (((?:treatment\s+rendered))\s*\:?) + +
+
+ ((((?:other\s+treatments?\s*\/\s*procedures\s+\(\s+not\s+in\s+o\.r\.\s+\)))|(other\s+treatments?\s+and\s+procedures))\s*\:?) + +
+ +
+ (?:unit\s+number\s*\:) + +
+ +
+ ((return\s+to\s+work)\s*\:) + +
+ +
+ (?:\s+Exam\:) + +
+
+ (?:ORIGINAL REPORT) + +
+
+ (?:\s+\*+\s+Final\s+\*+) +
+
+ (?:Indications\:) + +
+
+
+

Propchange: incubator/ctakes/trunk/ctakes-assertion-zoner/src/main/resources/org/mitre/medfacts/uima/section_regex.04162012.xml
------------------------------------------------------------------------------
    svn:executable = *
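
For readers of this archive: the XML markup of section_regex.04162012.xml did not survive in the listing above, so empty groups such as `(?:(?:|)\s*\:)` most likely mark places where a named fragment was referenced. The sketch below only illustrates that general idea. It is written in Python rather than the module's Java, the `{name}` placeholder syntax and the `expand` and `find_headers` helpers are hypothetical, and the header template is an assumed example, not a reconstruction of any rule in the file. The two fragment bodies are copied from the fragment list in the diff.

```python
import re

# Fragment bodies copied verbatim from the fragment list in the diff.
FRAGMENTS = {
    "allergies1": r"(?:allergies|allergy|adverse\s+drug\s+reactions?)",
    "allergies2": r"(allergy(\s*\/\s*reaction)?\s+profile)",
}

# Hypothetical placeholder syntax: {name} stands in for a fragment reference.
# This template is an assumption made for illustration only.
HEADER_TEMPLATE = r"(?:(?:{allergies2}|{allergies1})\s*\:)"


def expand(template, fragments):
    """Replace each {name} placeholder with the matching fragment body."""
    return re.sub(r"\{([\w-]+)\}", lambda m: fragments[m.group(1)], template)


def find_headers(text, pattern):
    """Return (start, end, matched_text) for every header match in text."""
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]


# Case-insensitive matching is an assumption; clinical notes mix
# upper-case and mixed-case headings.
header = re.compile(expand(HEADER_TEMPLATE, FRAGMENTS), re.IGNORECASE | re.MULTILINE)

note = "ALLERGIES: penicillin\nAllergy / Reaction Profile : none recorded\n"
matches = find_headers(note, header)
for start, end, matched in matches:
    print(start, end, repr(matched))
```

Keeping the vocabulary in named fragments means a synonym such as "adverse drug reactions" is added in one place and picked up by every rule that refers to that fragment.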