ctakes-notifications mailing list archives

From "jay vyas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CTAKES-314) BigTop/Hadoop cTAKES integration
Date Sat, 11 Oct 2014 15:44:34 GMT

    [ https://issues.apache.org/jira/browse/CTAKES-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168211#comment-14168211 ]

jay vyas commented on CTAKES-314:
---------------------------------

 [~chenpei] ... *Getting close!* But only *one term is being printed* :( ... the {{_InitialView}} term.

- I don't see any other medical terms.
- I've also added a lot of new terms to the documentText, expecting that this would help catch a few.
- Any thoughts on how I can get some useful terms?

Here is my code: 
{noformat}

  def getDefaultPipeline():AnalysisEngineDescription = {
    def builder = new AggregateBuilder
    builder.add(AnalysisEngineFactory.createPrimitiveDescription(classOf[CopyNPChunksToLookupWindowAnnotations]));
    builder.add(AnalysisEngineFactory.createPrimitiveDescription(classOf[RemoveEnclosedLookupWindows]));
    builder.add(UmlsDictionaryLookupAnnotator.createAnnotatorDescription());
    builder.add(createAnnotatorDescription());
    builder.add(PolarityCleartkAnalysisEngine.createAnnotatorDescription());
    builder.add(SimpleSegmentAnnotator.createAnnotatorDescription());
    builder.add(TokenizerAnnotatorPTB.createAnnotatorDescription());
    builder.add(ContextDependentTokenizerAnnotator.createAnnotatorDescription());
    builder.add(createAnnotatorDescription());
    return builder.createAggregateDescription();
  }
  def main(args: Array[String]) {

    val aed:AnalysisEngineDescription = getDefaultPipeline();
      val jcas:JCas = JCasFactory.createJCas();
      jcas.setDocumentText("The patient is suffering from extreme pain due to shark bite. " +
        "Recommend continuing use of aspirin, oxycodone, and coumadin. atient denies smoking and chest " +
        "pain. Patient has no cancer. There is no sign of multiple sclerosis. Continue exercise for " +
        "obesity and hypertension. ");

      SimplePipeline.runPipeline(jcas, aed);

      // Print every feature structure in the CAS, not just tokens.
      // To restrict the output to tokens, use the commented-out selection instead:
      //val iter = JCasUtil.select(jcas, classOf[BaseToken]).iterator()
      val iter = JCasUtil.selectAll(jcas).iterator();


      while(iter.hasNext){
        val entity = iter.next();
        System.out.println("Token: " + entity.toString()  + " " );
      }

  }
{noformat}
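
A possible explanation, offered as an assumption rather than a confirmed diagnosis: in Scala, {{def builder = new AggregateBuilder}} declares a method, so every reference to {{builder}} evaluates {{new AggregateBuilder}} again. If so, each {{add}} call in {{getDefaultPipeline}} goes to a throwaway instance, and {{createAggregateDescription()}} is called on a fresh builder with nothing added. The sketch below shows only the language behavior, using a hypothetical {{ToyBuilder}} stand-in (not a cTAKES or uimaFIT class) so that it runs without any dependencies.

```scala
import scala.collection.mutable.ListBuffer

object DefVersusVal {
  // Hypothetical stand-in for a builder: it only records what was added.
  final class ToyBuilder {
    private val steps = ListBuffer.empty[String]
    def add(step: String): Unit = { steps += step; () }
    def result: List[String] = steps.toList
  }

  // `def` makes `builder` a method: each reference constructs a new ToyBuilder.
  def withDef(): List[String] = {
    def builder = new ToyBuilder
    builder.add("segment")   // added to instance #1, then discarded
    builder.add("tokenize")  // added to instance #2, then discarded
    builder.result           // read from instance #3, which is empty
  }

  // `val` evaluates once: every reference is the same ToyBuilder.
  def withVal(): List[String] = {
    val builder = new ToyBuilder
    builder.add("segment")
    builder.add("tokenize")
    builder.result
  }

  def main(args: Array[String]): Unit = {
    println(s"def: ${withDef()}") // prints "def: List()"
    println(s"val: ${withVal()}") // prints "val: List(segment, tokenize)"
  }
}
```

If this is what is happening in the pipeline code, changing {{def builder}} to {{val builder}} would be the first thing to try before looking at the annotators themselves.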

Now, here is the output 

{noformat} 
/usr/lib/jvm/java-1.7.0-openjdk/bin/java -Didea.launcher.port=7534 -Didea.launcher.bin.path=/opt/idea-IU-135.1230/bin....
com.intellij.rt.execution.application.AppMain sparkapps.CTakesExample
log4j: reset attribute= "false".
log4j: Threshold ="null".
log4j: Level value for root is  [INFO].
log4j: root level set to INFO
log4j: Class name: [org.apache.log4j.ConsoleAppender]
log4j: Parsing layout of class: "org.apache.log4j.PatternLayout"
log4j: Setting property [conversionPattern] to [%d{dd MMM yyyy HH:mm:ss} %5p %c{1} - %m%n].
log4j: Adding appender named [consoleAppender] to category [root].
Token: DocumentAnnotation
   sofa: _InitialView
   begin: 0
   end: 273
   language: "x-unspecified"
 

Process finished with exit code 0
{noformat} 

> BigTop/Hadoop cTAKES integration
> --------------------------------
>
>                 Key: CTAKES-314
>                 URL: https://issues.apache.org/jira/browse/CTAKES-314
>             Project: cTAKES
>          Issue Type: New Feature
>    Affects Versions: 3.2.0
>            Reporter: Pei Chen
>             Fix For: 3.2.3
>
>         Attachments: Napkin_cTAKES_Hadoop.JPG
>
>
> Placeholder to:
> Create a simple application that can take in different data sources (public forums, Twitter, etc.) and scale up cTAKES using the BigTop/Hadoop ecosystem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
