ctakes-commits mailing list archives

From chen...@apache.org
Subject svn commit: r1635170 - in /ctakes/sandbox: ./ org/ org/apache/ org/apache/ctakes/ org/apache/ctakes/dictionary/ org/apache/ctakes/dictionary/lookup/ src/ src/main/ src/main/java/ src/main/resources/ src/main/scala/ src/main/scala/sparkapps/ src/main/sc...
Date Wed, 29 Oct 2014 15:13:48 GMT
Author: chenpei
Date: Wed Oct 29 15:13:47 2014
New Revision: 1635170

URL: http://svn.apache.org/r1635170
Log:
CTAKES-314 - Initial ctakes/spark/hadoop example into sandbox.  Thanks Jay Vyas for the contribution.

Added:
    ctakes/sandbox/README.md   (with props)
    ctakes/sandbox/build.sbt   (with props)
    ctakes/sandbox/org/
    ctakes/sandbox/org/apache/
    ctakes/sandbox/org/apache/ctakes/
    ctakes/sandbox/org/apache/ctakes/dictionary/
    ctakes/sandbox/org/apache/ctakes/dictionary/lookup/
    ctakes/sandbox/org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml
    ctakes/sandbox/src/
    ctakes/sandbox/src/main/
    ctakes/sandbox/src/main/java/
    ctakes/sandbox/src/main/resources/
    ctakes/sandbox/src/main/scala/
    ctakes/sandbox/src/main/scala/sparkapps/
    ctakes/sandbox/src/main/scala/sparkapps/SparkApp1.scala   (with props)
    ctakes/sandbox/src/main/scala/sparkapps/ctakes/
    ctakes/sandbox/src/main/scala/sparkapps/ctakes/CTakesExample.scala
    ctakes/sandbox/src/main/scala/sparkapps/ctakes/CTakesTwitterStreamingApp.scala   (with props)
    ctakes/sandbox/src/main/scala/sparkapps/ctakes/Parser.scala
    ctakes/sandbox/src/main/scala/sparkapps/ctakes/TermAnalyzer.scala
    ctakes/sandbox/src/main/scala/sparkapps/ctakes/TwitterInputDStreamCTakes.scala
    ctakes/sandbox/src/main/scala/sparkapps/ctakes/TwitterUtilsJ.scala
    ctakes/sandbox/src/main/scala/sparkapps/ctakes/Utils.scala
    ctakes/sandbox/src/test/
    ctakes/sandbox/src/test/java/
    ctakes/sandbox/src/test/resources/
    ctakes/sandbox/src/test/scala/
    ctakes/sandbox/src/test/scala/TestSpark.scala   (with props)
    ctakes/sandbox/src/test/scala/TestStreaming.scala
    ctakes/sandbox/twitter

Added: ctakes/sandbox/README.md
URL: http://svn.apache.org/viewvc/ctakes/sandbox/README.md?rev=1635170&view=auto
==============================================================================
--- ctakes/sandbox/README.md (added)
+++ ctakes/sandbox/README.md Wed Oct 29 15:13:47 2014
@@ -0,0 +1,44 @@
+This is a Spark SBT-based application which starts a stream of Twitter data and processes it
+using cTAKES.  This application uses Spark Streaming, and the architecture is hardcoded (easily modified)
+to start two local Spark daemons.  One will do the processing, and the other will launch a Spark thread that
+reads from Twitter and will output "n" RDDs every second (for details, see the sliding-interval function
+in the DStream implementation in this source code).
+
+This is a prototype to demonstrate how to scalably mine data from Twitter and feed it into the cTAKES API.
+
+As is, it is not immediately useful and needs to be customized to the user's needs.  We will update it in the future
+with a few more options.
+
+Feedback is very welcome!  And so would be any fixes or bug reports!
+
+- To run it, first get your app keys from https://apps.twitter.com/app/
+
+- Then, copy the "twitteR" file in this directory to /tmp/twitter (hardcoded for now).
+
+- After that, you can run the main method in the Driver class.  You can do this from your IDE (IntelliJ has great SBT support),
+or else use SBT's run function, which inspects for main classes for you (see the comment stream in CTAKES-314 for an example).
+
+The final output will look something like this:
+
+28 Oct 2014 18:26:34  INFO SentenceDetector - Sentence detector model file: org/apache/ctakes/core/sentdetect/sd-med-model.zip
+28 Oct 2014 18:26:34  INFO SentenceDetector - Starting processing.
+28 Oct 2014 18:26:34  INFO TokenizerAnnotatorPTB - process(JCas) in org.apache.ctakes.core.ae.TokenizerAnnotatorPTB
+28 Oct 2014 18:26:34  INFO ContextDependentTokenizerAnnotator - process(JCas)
+
+Interleaved, you will see some very rudimentary cTAKES text analyses with POS tagging:
+
+------"------Oct NNP------28 CD------, ,------2014 CD------6 CD------: :------25 CD------: :------05 CD------PM NN------" ''------, ,------"  ------id NN------" ''------: :------527224559947509762 CD------, ,------"
+
+
+Finally, at the end, you should see that we've collected enough tweets (the number of tweets to collect before killing the Spark slaves can be changed in the code):
+
+
+ PROGRESS ::: 17 so far, out of 10 
+
+
+28 Oct 2014 18:34:40  INFO ReceiverTracker - Sent stop signal to all 1 receivers
+
+Final notes: This is an example and proof of concept of how to glue cTAKES into a big data
+framework such as Spark, and is just a first iteration.  It is not meant to run in production,
+but rather as a local sandbox for building a Twitter stream that uses cTAKES to process
+medical terms along with Spark Streaming.  In the future, by procuring a Twitter firehose
+and modifying some of the core parameters in the Driver class, it is expected that cTAKES
+can be applied to various big-data-related data streams.
+
+
+
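A note on the /tmp/twitter file the README references: based on the sample printed by Driver.failTwFile later in this commit, it is a plain key/value file of the four OAuth credentials. The xxx/yyy/zzz/aaa values below are placeholders, not real keys:

```
consumerKey=xxx
consumerSecret=yyy
accessToken=zzz
accessTokenSecret=aaa
```

Driver.readParameter splits each line on "=", so any spaces around the equals sign would end up inside the key or value.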

Propchange: ctakes/sandbox/README.md
------------------------------------------------------------------------------
    svn:executable = 

Added: ctakes/sandbox/build.sbt
URL: http://svn.apache.org/viewvc/ctakes/sandbox/build.sbt?rev=1635170&view=auto
==============================================================================
--- ctakes/sandbox/build.sbt (added)
+++ ctakes/sandbox/build.sbt Wed Oct 29 15:13:47 2014
@@ -0,0 +1,40 @@
+name := "SparkSBT"
+
+version := "1.0"
+
+scalaVersion := "2.10.4"
+
+
+libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0"
+
+libraryDependencies +=  "org.scalatest" % "scalatest_2.10.0-M4" % "1.9-2.10.0-M4-B1"
+
+libraryDependencies +=  "junit" % "junit" % "4.8.1" % "test"
+
+libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.1.0"
+
+libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.1.0"
+
+libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.1.0"
+
+libraryDependencies += "org.apache.spark" %% "spark-streaming-twitter" % "1.1.0"
+
+libraryDependencies += "com.google.code.gson" % "gson" % "2.3"
+
+libraryDependencies += "org.twitter4j" % "twitter4j-core" % "3.0.3"
+
+libraryDependencies += "commons-cli" % "commons-cli" % "1.2"
+
+libraryDependencies += "org.apache.ctakes" % "ctakes-core" % "3.2.0"
+
+libraryDependencies += "org.apache.ctakes" % "ctakes-core-res" % "3.2.0"
+
+libraryDependencies += "org.apache.ctakes" % "ctakes-constituency-parser" % "3.2.0"
+
+libraryDependencies += "org.apache.ctakes" % "ctakes-clinical-pipeline" % "3.2.0"
+
+resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
+
+resolvers += "opennlp sourceforge repo" at "http://opennlp.sourceforge.net/maven2"
+
+resolvers += "Sonatype Snapshots" at "https://oss.sonatype.org/content/repositories/releases/"
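A note on the build file above: in sbt, `%%` appends the Scala binary version to the artifact name, while `%` uses the name verbatim. So, with scalaVersion 2.10.4, the two declarations below resolve to the same artifact (a sketch of the equivalence, not an addition to the build):

```scala
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0"
// with scalaVersion := "2.10.4", the line above is equivalent to:
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.1.0"
```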

Propchange: ctakes/sandbox/build.sbt
------------------------------------------------------------------------------
    svn:executable = 

Added: ctakes/sandbox/org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml
URL: http://svn.apache.org/viewvc/ctakes/sandbox/org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml?rev=1635170&view=auto
==============================================================================
--- ctakes/sandbox/org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml (added)
+++ ctakes/sandbox/org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml Wed Oct 29 15:13:47 2014
@@ -0,0 +1,102 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+-->
+<lookupSpecification>
+	<!--  Defines what dictionaries will be used in terms of implementation specifics and metaField configuration. -->
+	<dictionaries>
+	
+		<dictionary id="DICT_UMLS_MS" externalResourceKey="DbConnection" caseSensitive="false">
+			<implementation>
+				<jdbcImpl tableName="umls_ms_2011ab"/>
+			</implementation>
+			<lookupField fieldName="fword"/>
+			<metaFields>
+				<metaField fieldName="cui"/>
+				<metaField fieldName="tui"/>
+				<metaField fieldName="text"/>
+			</metaFields>
+		</dictionary>
+	
+		<dictionary id="DICT_RXNORM" externalResourceKey="RxnormIndexReader" caseSensitive="false">
+			<implementation>
+				<luceneImpl/>
+			</implementation>
+			<lookupField fieldName="first_word"/>
+			<metaFields>
+				<metaField fieldName="code"/>
+				<metaField fieldName="codeRxNorm"/>
+				<metaField fieldName="preferred_designation"/>
+				<metaField fieldName="other_designation"/>
+			</metaFields>
+		</dictionary>
+	
+	</dictionaries>
+	<!-- Binds together the components necessary to perform the complete lookup logic start to end. -->
+	<lookupBindings>
+	
+		<lookupBinding>
+			<dictionaryRef idRef="DICT_UMLS_MS"/>
+			<lookupInitializer className="org.apache.ctakes.dictionary.lookup.ae.FirstTokenPermLookupInitializerImpl">
+				<properties>
+					<property key="textMetaFields" value="text"/>
+					<property key="maxPermutationLevel" value="7"/>
+					<!--	<property key="windowAnnotations" value="org.apache.ctakes.typesystem.type.textspan.Sentence"/> -->
+					<property key="windowAnnotations" value="org.apache.ctakes.typesystem.type.textspan.LookupWindowAnnotation"/>
+					<property key="exclusionTags" value="VB,VBD,VBG,VBN,VBP,VBZ,CC,CD,DT,EX,LS,MD,PDT,POS,PP,PP$,PRP,PRP$,RP,TO,WDT,WP,WPS,WRB"/>
+				</properties>
+			</lookupInitializer>
+			<lookupConsumer className="org.apache.ctakes.dictionary.lookup.ae.UmlsToSnomedDbConsumerImpl">
+				<properties>
+					<property key="codingScheme" value="SNOMED"/>
+					<property key="cuiMetaField" value="cui"/>
+					<property key="tuiMetaField" value="tui"/>
+					<property key="anatomicalSiteTuis" value="T021,T022,T023,T024,T025,T026,T029,T030"/>
+					<property key="procedureTuis" value="T059,T060,T061"/>
+					<property key="disorderTuis" value="T019,T020,T037,T046,T047,T048,T049,T050,T190,T191"/>
+					<property key="findingTuis" value="T033,T034,T040,T041,T042,T043,T044,T045,T046,T056,T057,T184"/>
+					<property key="dbConnExtResrcKey" value="DbConnection"/>
+					<property key="mapPrepStmt" value="select code from umls_snomed_map where cui=?"/>
+				</properties>
+			</lookupConsumer>
+		</lookupBinding>
+	
+		<lookupBinding>
+			<dictionaryRef idRef="DICT_RXNORM"/>
+			<lookupInitializer className="org.apache.ctakes.dictionary.lookup.ae.FirstTokenPermLookupInitializerImpl">
+				<properties>
+					<property key="textMetaFields" value="preferred_designation|other_designation"/>
+					<property key="maxPermutationLevel" value="7"/>
+					<!--	<property key="windowAnnotations" value="org.apache.ctakes.typesystem.type.textspan.Sentence"/> -->
+					<property key="windowAnnotations" value="org.apache.ctakes.typesystem.type.textspan.LookupWindowAnnotation"/>
+					<property key="exclusionTags" value="VB,VBD,VBG,VBN,VBP,VBZ,CC,CD,DT,EX,LS,MD,PDT,POS,PP,PP$,RP,TO,WDT,WP,WPS,WRB"/>
+				</properties>
+			</lookupInitializer>
+			<lookupConsumer className="org.apache.ctakes.dictionary.lookup.ae.OrangeBookFilterConsumerImpl">
+				<properties>
+					<property key="codingScheme" value="RXNORM"/>
+					<property key="codeMetaField" value="codeRxNorm"/> <!-- Use value="code" for UMLS CUIs -->
+					<property key="luceneFilterExtResrcKey" value="OrangeBookIndexReader"/>
+				</properties>
+			</lookupConsumer>
+		</lookupBinding>
+		
+	</lookupBindings>
+</lookupSpecification>

Added: ctakes/sandbox/src/main/scala/sparkapps/SparkApp1.scala
URL: http://svn.apache.org/viewvc/ctakes/sandbox/src/main/scala/sparkapps/SparkApp1.scala?rev=1635170&view=auto
==============================================================================
--- ctakes/sandbox/src/main/scala/sparkapps/SparkApp1.scala (added)
+++ ctakes/sandbox/src/main/scala/sparkapps/SparkApp1.scala Wed Oct 29 15:13:47 2014
@@ -0,0 +1,45 @@
+package sparkapps
+
+import org.apache.spark.SparkContext
+import org.apache.spark.SparkContext._
+import org.apache.spark.SparkConf
+import org.apache.spark.rdd.RDD
+
+/**
+ * Created by apache on 7/20/14.
+ */
+object SparkApp1 {
+
+  def sparkJob() = {
+
+    val logFile = "/etc/passwd" // Should be some file on your system
+    val conf = new SparkConf()
+      .setAppName("Simple Application")
+      //this needs to be parameterized.
+      .setMaster("local")
+
+    val sc = new SparkContext(conf)
+
+    val logData = sc.textFile(logFile, 2).cache()
+
+    val numAs =
+      logData.filter(line => line.contains("a")).count()
+
+    val numBs =
+      logData.filter(line => line.contains("b")).count()
+
+    val piped = logData.pipe("grep a").collect();
+
+    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
+  }
+
+  def main(args: Array[String]) {
+      if(args.length==0)
+        sparkJob();
+      else {
+        args(0) match   {
+        case "1" => sparkJob();
+        }
+      }
+  }
+}

Propchange: ctakes/sandbox/src/main/scala/sparkapps/SparkApp1.scala
------------------------------------------------------------------------------
    svn:executable = 

Added: ctakes/sandbox/src/main/scala/sparkapps/ctakes/CTakesExample.scala
URL: http://svn.apache.org/viewvc/ctakes/sandbox/src/main/scala/sparkapps/ctakes/CTakesExample.scala?rev=1635170&view=auto
==============================================================================
--- ctakes/sandbox/src/main/scala/sparkapps/ctakes/CTakesExample.scala (added)
+++ ctakes/sandbox/src/main/scala/sparkapps/ctakes/CTakesExample.scala Wed Oct 29 15:13:47 2014
@@ -0,0 +1,60 @@
+package sparkapps.ctakes
+
+import java.text.BreakIterator
+
+import opennlp.tools.postag.POSTagger
+import opennlp.tools.sentdetect.{SentenceDetectorME, SentenceModel, SentenceDetector}
+import org.apache.ctakes.assertion.medfacts.cleartk.PolarityCleartkAnalysisEngine
+import org.apache.ctakes.clinicalpipeline.ClinicalPipelineFactory.{RemoveEnclosedLookupWindows, CopyNPChunksToLookupWindowAnnotations}
+import org.apache.ctakes.constituency.parser.ae.ConstituencyParser
+import org.apache.ctakes.contexttokenizer.ae.ContextDependentTokenizerAnnotator
+import org.apache.ctakes.core.ae.{TokenizerAnnotatorPTB, SimpleSegmentAnnotator}
+import org.apache.ctakes.dependency.parser.ae.{ClearNLPSemanticRoleLabelerAE, ClearNLPDependencyParserAE}
+import org.apache.ctakes.dependency.parser.ae.ClearNLPDependencyParserAE._
+import org.apache.ctakes.dictionary.lookup.ae.UmlsDictionaryLookupAnnotator
+import org.apache.ctakes.typesystem.`type`.syntax.BaseToken
+import org.apache.ctakes.typesystem.`type`.textsem.IdentifiedAnnotation
+import org.apache.uima.analysis_engine.{AnalysisEngine, AnalysisEngineDescription}
+import org.apache.uima.jcas.JCas
+import org.cleartk.chunker.Chunker
+import org.uimafit.factory.{AnalysisEngineFactory, AggregateBuilder, JCasFactory}
+import org.uimafit.pipeline.SimplePipeline
+import org.uimafit.util.JCasUtil
+
+import scala.collection.JavaConverters._
+
+
+object CTakesExample {
+
+  def getDefaultPipeline():AnalysisEngine  = {
+      val builder = new AggregateBuilder
+      builder.add(SimpleSegmentAnnotator.createAnnotatorDescription());
+      builder.add(org.apache.ctakes.core.ae.SentenceDetector.createAnnotatorDescription());
+      builder.add(TokenizerAnnotatorPTB.createAnnotatorDescription());
+      builder.add(ContextDependentTokenizerAnnotator.createAnnotatorDescription());
+      builder.add(org.apache.ctakes.postagger.POSTagger.createAnnotatorDescription());
+      builder.add(org.apache.ctakes.chunker.ae.Chunker.createAnnotatorDescription());
+      builder.add(AnalysisEngineFactory.createPrimitiveDescription(classOf[CopyNPChunksToLookupWindowAnnotations]));
+      builder.add(AnalysisEngineFactory.createPrimitiveDescription(classOf[RemoveEnclosedLookupWindows]));
+      //builder.add(UmlsDictionaryLookupAnnotator.createAnnotatorDescription()); builder.add(PolarityCleartkAnalysisEngine.createAnnotatorDescription()); return builder.createAggregateDescription(); }
+      builder.createAggregate()
+    }
+
+  def main(args: Array[String]) {
+        val aed:AnalysisEngine= getDefaultPipeline();
+        val jcas:JCas = JCasFactory.createJCas();
+        jcas.setDocumentText("The patient is suffering from extreme pain due to shark bite. Recommend continuing use of aspirin, oxycodone, and coumadin. Patient denies smoking and chest pain. Patient has no cancer. There is no sign of multiple sclerosis. Continue exercise for obesity and hypertension. ");
+
+        SimplePipeline.runPipeline(jcas, aed);
+
+        //Print out the tokens and Parts of Speech
+
+        val iter = JCasUtil.select(jcas,classOf[BaseToken]).iterator()
+        System.out.println(iter.hasNext);
+        while(iter.hasNext)
+        {
+          val entity = iter.next();
+          System.out.println(entity.getCoveredText + " " + entity.getPartOfSpeech);
+        }
+  }
+}

Added: ctakes/sandbox/src/main/scala/sparkapps/ctakes/CTakesTwitterStreamingApp.scala
URL: http://svn.apache.org/viewvc/ctakes/sandbox/src/main/scala/sparkapps/ctakes/CTakesTwitterStreamingApp.scala?rev=1635170&view=auto
==============================================================================
--- ctakes/sandbox/src/main/scala/sparkapps/ctakes/CTakesTwitterStreamingApp.scala (added)
+++ ctakes/sandbox/src/main/scala/sparkapps/ctakes/CTakesTwitterStreamingApp.scala Wed Oct 29 15:13:47 2014
@@ -0,0 +1,193 @@
+package sparkapps.ctakes
+
+import java.io.File
+
+import com.google.gson.Gson
+import com.google.gson._
+import jregex.Pattern
+import org.apache.spark.storage.StorageLevel
+import org.apache.spark.streaming.dstream.{DStream, ReceiverInputDStream}
+import org.apache.spark.streaming.twitter.TwitterUtils
+import org.apache.spark.streaming.{Seconds, StreamingContext}
+import org.apache.spark.{SparkConf, SparkContext}
+import scala.runtime.ScalaRunTime._
+/**
+ * Collect at least the specified number of tweets into json text files.
+ */
+object Driver {
+
+  private var numTweetsCollected = 0L
+  private var partNum = 0
+  private var gson = new Gson()
+
+  /**
+   * Maintain twitter credentials in a /tmp/twitter file.
+   * @param twitterParam
+   * @return
+   */
+  def readParameter(twitterParam:String) : Option[String] = {
+    if(! new File("/tmp/twitter").exists()){
+      System.err.println("MAJOR failure.  No /tmp/twitter file exists.")
+      return None
+    }
+    //find the parameter in /tmp/twitter.
+    def read(param:String):Option[String]= {
+      scala.io.Source.fromFile("/tmp/twitter").getLines().foreach {
+        x =>
+          System.out.println("line : " + x)
+          if (! x.contains("=")){
+            System.err.println("MAJOR failure.  Bad line " +x + " in /tmp/twitter.")
+          }
+          else if (x.contains(param)) {
+            return Some(x.split("=")(1));
+          }
+      }
+      System.err.println("Uhoh ! Didnt see the twitter param in /tmp/twitter for " + twitterParam);
+      None
+    }
+
+    //just return it for now , in future maybe we'll handle errors by prompting user.
+    val x = read(twitterParam);
+    x match {
+      case Some(x) => return Some(x);
+      case None => {
+       //FUTURE : Prompt user for this parameter.
+       return None;
+      }
+    }
+  }
+  /**
+   * Input= --outputDirectory --numtweets --intervals --partitions
+   * Output= outputdir numtweets intervals partitions consumerKey consumerSecret accessToken accessTokenSecret
+   */
+  def main(args: Array[String]) {
+    def failTwFile() = {
+      System.err.println("FAILURE to read values from /tmp/twitter credentials. ")
+      System.err.println("Please write a k/v file like this:")
+      System.err.println("consumerKey=xxx")
+      System.err.println("consumerSecret=yyy")
+      System.err.println("accessToken=zzz")
+      System.err.println("accessTokenSecret=aaa")
+      System.err.println("To /tmp/twitter, and restart this app.")
+      System.exit(2)
+    }
+
+    System.out.println("START:  Put consumerkey,consumer_secret,access_token,access_token_secret in /tmp/twitter, " +
+      "or it will be written for you interactively....")
+    if(args.length==0) {
+      val defs = Array(
+        "--outputDirectory", "/tmp/OUTPUT_" + System.currentTimeMillis(),
+        "--numtweets", "10",
+        "--intervals", "10",
+        "--partitions", "1",
+        //added as system properties.
+        /** qoute at the end is for type inference **/
+        "twitter4j.oauth." + Parser.CONSUMER_KEY, readParameter(Parser.CONSUMER_KEY).getOrElse({failTwFile(); ""}),
+        "twitter4j.oauth." + Parser.CONSUMER_SECRET, readParameter(Parser.CONSUMER_SECRET).getOrElse({failTwFile(); ""}),
+        "twitter4j.oauth." + Parser.ACCESS_TOKEN, readParameter(Parser.ACCESS_TOKEN).getOrElse({failTwFile(); ""}),
+        "twitter4j.oauth." + Parser.ACCESS_TOKEN_SECRET, readParameter(Parser.ACCESS_TOKEN_SECRET).getOrElse({failTwFile(); ""}));
+
+      //TODO clean up this.  Could lead to infinite recursion.
+      System.err.println("Usage: " + this.getClass.getSimpleName + " executing w/ default options ! " + defs)
+      main(defs);
+      return;
+    }
+     /**
+     * Here we declare an array of values which map to the ordered.
+     * Each value (i.e. numTweetsToCollect) is a newly declared variable that is
+     * destructured from the parseCommandLineWithTwitterCredentials(args) monad.
+     */
+    val Array(
+    //alphabetical order returned by values.
+    Utils.IntParam(intervalSecs),
+    Utils.IntParam(numTweetsToCollect),
+    outputDirectory,
+    Utils.IntParam(partitionsEachInterval)) =
+      Parser.parse(args)
+
+    verifyAndRun(intervalSecs,numTweetsToCollect, new File(outputDirectory), partitionsEachInterval);
+  }
+
+  def verify() = {
+    /**
+     * Checkpoint confirms that each system property exists.
+     */
+    Utils.checkpoint(
+    //verifier
+    {
+      xp =>
+        System.getProperty(xp.toString) != null;
+    },
+    //error messages.
+    {
+      xp => System.err.println("Failure: " + xp)
+    },
+    //properties to be verified.
+    List(
+      "twitter4j.oauth.consumerKey",
+      "twitter4j.oauth.consumerSecret",
+      "twitter4j.oauth.accessToken",
+      "twitter4j.oauth.accessTokenSecret")
+    )
+  }
+
+  def verifyAndRun(intervalSecs:Int, numTweetsToCollect:Int, outputDirectory:File, partitionsEachInterval:Int) = {
+
+    System.out.println(
+      "Params = seconds= " + intervalSecs +
+        " tweets= " + numTweetsToCollect + ", " +
+        " out =" + outputDirectory + ", " +
+        " partitions= " + partitionsEachInterval)
+
+    verify();
+
+    if (outputDirectory.exists()) {
+      System.err.println("ERROR - %s already exists: delete or specify another directory".format(outputDirectory))
+      System.exit(2)
+    }
+    startStream(intervalSecs,partitionsEachInterval,numTweetsToCollect);
+  }
+
+  def startStream(intervalSecs:Int, partitionsEachInterval:Int, numTweetsToCollect:Int) = {
+    println("Initializing Streaming Spark Context...")
+
+    val conf = new SparkConf()
+      .setAppName(this.getClass.getSimpleName+""+System.currentTimeMillis())
+      .setMaster("local[2]")
+    val sc = new SparkContext(conf)
+    val ssc = new StreamingContext(sc, Seconds(intervalSecs))
+
+    val tweetStream = TwitterUtilsCtakes.createStream(
+      ssc,
+      Utils.getAuth,
+      Seq("medical"),
+      StorageLevel.MEMORY_ONLY)
+        .map(gson.toJson(_))
+        .filter(!_.contains("boundingBoxCoordinates"))//some kind of spark jira to fix this.
+
+    var checks = 0;
+    tweetStream.foreachRDD(rdd => {
+      val outputRDD = rdd.repartition(partitionsEachInterval)
+      System.out.println(rdd.count());
+      numTweetsCollected += rdd.count()
+      System.out.println("\n\n\n PROGRESS ::: "+numTweetsCollected + " so far, out of " + numTweetsToCollect + " \n\n\n ");
+      if (numTweetsCollected > numTweetsToCollect) {
+          ssc.stop()
+          sc.stop();
+          System.exit(0)
+      }
+    })
+
+    /**
+     * This is where we invoke CTakes.  For your CTakes implementation, you would change the logic here
+     * to do something like store results to a file, or do a more sophisticated series of tasks.
+     */
+    val stream = tweetStream.map(
+      x =>
+        System.out.println("processed :::::::::: " + CtakesTermAnalyzer.analyze(x)));
+
+    stream.print();
+    ssc.start()
+    ssc.awaitTermination()
+  }
+}
\ No newline at end of file

Propchange: ctakes/sandbox/src/main/scala/sparkapps/ctakes/CTakesTwitterStreamingApp.scala
------------------------------------------------------------------------------
    svn:executable = 

Added: ctakes/sandbox/src/main/scala/sparkapps/ctakes/Parser.scala
URL: http://svn.apache.org/viewvc/ctakes/sandbox/src/main/scala/sparkapps/ctakes/Parser.scala?rev=1635170&view=auto
==============================================================================
--- ctakes/sandbox/src/main/scala/sparkapps/ctakes/Parser.scala (added)
+++ ctakes/sandbox/src/main/scala/sparkapps/ctakes/Parser.scala Wed Oct 29 15:13:47 2014
@@ -0,0 +1,52 @@
+package sparkapps.ctakes
+import scala.collection.immutable._;
+/**
+ * A simple utility for parsing arguments.
+ * As input, it takes the command line arguments.
+ * It returns a list of ordered args for the program,
+ * i.e. input "--a 1 --b 2 c 2" --> output "1 2 ...".
+ * It also sets system properties for any args which
+ * don't have switches (i.e. System.setProperty("c","2") will be called).
+ */
+object Parser {
+
+  val CONSUMER_KEY = "consumerKey"
+  val CONSUMER_SECRET = "consumerSecret"
+  val ACCESS_TOKEN = "accessToken"
+  val ACCESS_TOKEN_SECRET = "accessTokenSecret"
+
+  type OptionMap = TreeMap[String,String]
+
+  //Returns a map of commandline options.
+  //for unknown options, sets them as system properties, so that
+  //arbitrary properties can be set.
+  def nextOption(map : OptionMap, list : List[String]): OptionMap = {
+    def isSwitch(s:String) = s.startsWith("--")
+    list match {
+      case Nil => map
+      case "--outputDirectory" :: value :: tail => nextOption(map++Map("outputDirectory"->value),tail)
+      case "--numtweets" :: value :: tail => nextOption(map++Map("numtweets"->value),tail)
+      case "--intervals" :: value :: tail => nextOption(map++Map("interval"->value),tail)
+      case "--partitions" :: value :: tail => nextOption(map++Map("partitions"->value),tail)
+      case unknown :: value :: tail =>
+        System.out.println("Setting sys prop " + unknown + " "+value)
+        System.setProperty(unknown,value)
+        nextOption(map,tail)
+    }
+  }
+
+  /**
+   * Send in the arguments for this app to this method.  It will
+   * return the args in order.
+   */
+  def parse(args:Array[String]) : Array[String] = {
+    val iter = nextOption(new TreeMap(),args.toList).valuesIterator
+    val returnVal = Array(
+      iter.next(),
+      iter.next(),
+      iter.next(),
+      iter.next()) ;
+    System.out.println("Parsed inputs : " + returnVal.mkString(" "))
+    returnVal
+  }
+}
\ No newline at end of file
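Why Driver destructures Parser.parse's result in the order (intervals, numtweets, outputDirectory, partitions): the options are collected in a TreeMap, whose valuesIterator walks keys alphabetically, regardless of the order the switches appeared on the command line. A minimal, self-contained demonstration of that ordering behavior (option names mirror the committed code; the values are made up):

```scala
import scala.collection.immutable.TreeMap

object OrderingDemo {
  def main(args: Array[String]): Unit = {
    // Options inserted in command-line order...
    val opts = TreeMap(
      "outputDirectory" -> "/tmp/out",
      "numtweets"       -> "10",
      "interval"        -> "5",
      "partitions"      -> "1")
    // ...but valuesIterator returns them in alphabetical key order:
    // interval, numtweets, outputDirectory, partitions
    val ordered = opts.valuesIterator.toArray
    assert(ordered.sameElements(Array("5", "10", "/tmp/out", "1")))
    println(ordered.mkString(" "))
  }
}
```

This is also why adding a new switch to Parser can silently shift the meaning of every positional value in Driver's destructuring.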

Added: ctakes/sandbox/src/main/scala/sparkapps/ctakes/TermAnalyzer.scala
URL: http://svn.apache.org/viewvc/ctakes/sandbox/src/main/scala/sparkapps/ctakes/TermAnalyzer.scala?rev=1635170&view=auto
==============================================================================
--- ctakes/sandbox/src/main/scala/sparkapps/ctakes/TermAnalyzer.scala (added)
+++ ctakes/sandbox/src/main/scala/sparkapps/ctakes/TermAnalyzer.scala Wed Oct 29 15:13:47 2014
@@ -0,0 +1,55 @@
+package sparkapps.ctakes
+
+import org.apache.ctakes.clinicalpipeline.ClinicalPipelineFactory.{RemoveEnclosedLookupWindows, CopyNPChunksToLookupWindowAnnotations}
+import org.apache.ctakes.contexttokenizer.ae.ContextDependentTokenizerAnnotator
+import org.apache.ctakes.core.ae.{TokenizerAnnotatorPTB, SimpleSegmentAnnotator}
+import org.apache.ctakes.typesystem.`type`.syntax.BaseToken
+import org.apache.uima.analysis_engine.AnalysisEngine
+import org.apache.uima.jcas.JCas
+import org.uimafit.factory.{JCasFactory, AnalysisEngineFactory, AggregateBuilder}
+import org.uimafit.pipeline.SimplePipeline
+import org.uimafit.util.JCasUtil
+
+/**
+ *
+ * This is a module which analyzes terms
+ * and returns their metadata.  It is a wrapper
+ * around the cTAKES library, used for Spark
+ * streaming cTAKES analysis of Tweets.
+ */
+object CtakesTermAnalyzer {
+
+  /**
+   * Simple pipeline.
+   */
+  def getDefaultPipeline():AnalysisEngine  = {
+    var builder = new AggregateBuilder
+    builder.add(SimpleSegmentAnnotator.createAnnotatorDescription());
+    builder.add(org.apache.ctakes.core.ae.SentenceDetector.createAnnotatorDescription());
+    builder.add(TokenizerAnnotatorPTB.createAnnotatorDescription());
+    builder.add(ContextDependentTokenizerAnnotator.createAnnotatorDescription());
+    builder.add(org.apache.ctakes.postagger.POSTagger.createAnnotatorDescription());
+    builder.add(org.apache.ctakes.chunker.ae.Chunker.createAnnotatorDescription());
+    builder.add(AnalysisEngineFactory.createPrimitiveDescription(classOf[CopyNPChunksToLookupWindowAnnotations]));
+    builder.add(AnalysisEngineFactory.createPrimitiveDescription(classOf[RemoveEnclosedLookupWindows]));
+    //builder.add(UmlsDictionaryLookupAnnotator.createAnnotatorDescription()); builder.add(PolarityCleartkAnalysisEngine.createAnnotatorDescription()); return builder.createAggregateDescription(); }
+    builder.createAggregate()
+  }
+
+
+  def analyze(text:String):Any = {
+    val aed:AnalysisEngine= getDefaultPipeline();
+    val jcas:JCas = JCasFactory.createJCas();
+    jcas.setDocumentText(text);
+    SimplePipeline.runPipeline(jcas, aed);
+    val iter = JCasUtil.select(jcas,classOf[BaseToken]).iterator()
+    while(iter.hasNext)
+    {
+      val entity = iter.next();
+      //for demonstration purposes , we print all this stuff.
+      System.out.print("---"+entity.getCoveredText + " " + entity.getPartOfSpeech+"---");
+    }
+    //return the iterator.
+    JCasUtil.select(jcas,classOf[BaseToken]).iterator()
+  }
+}

Added: ctakes/sandbox/src/main/scala/sparkapps/ctakes/TwitterInputDStreamCTakes.scala
URL: http://svn.apache.org/viewvc/ctakes/sandbox/src/main/scala/sparkapps/ctakes/TwitterInputDStreamCTakes.scala?rev=1635170&view=auto
==============================================================================
--- ctakes/sandbox/src/main/scala/sparkapps/ctakes/TwitterInputDStreamCTakes.scala (added)
+++ ctakes/sandbox/src/main/scala/sparkapps/ctakes/TwitterInputDStreamCTakes.scala Wed Oct 29 15:13:47 2014
@@ -0,0 +1,144 @@
+package sparkapps.ctakes
+
+
+
+import java.nio.ByteBuffer
+import java.util.Date
+import java.util.concurrent.{Callable, FutureTask}
+
+import org.apache.ctakes.core.fsm.token.BaseToken
+import org.apache.uima.analysis_engine.AnalysisEngineDescription
+import org.apache.uima.jcas.JCas
+import org.uimafit.factory.JCasFactory
+import org.uimafit.pipeline.SimplePipeline
+import org.uimafit.util.JCasUtil
+import twitter4j._
+import twitter4j.auth.Authorization
+import twitter4j.conf.ConfigurationBuilder
+import twitter4j.auth.OAuthAuthorization
+
+import org.apache.spark.streaming._
+import org.apache.spark.streaming.dstream._
+import org.apache.spark.storage.StorageLevel
+import org.apache.spark.Logging
+import org.apache.spark.streaming.receiver.Receiver
+
+
+
+  /* A stream of Twitter statuses, potentially filtered by one or more keywords.
+  *
+  * @constructor create a new Twitter stream using the supplied Twitter4J authentication credentials.
+  * An optional set of string filters can be used to restrict the set of tweets. The Twitter API is
+  * such that this may return a sampled subset of all tweets during each interval.
+  *
+  * If no Authorization object is provided, initializes OAuth authorization using the system
+  * properties twitter4j.oauth.consumerKey, .consumerSecret, .accessToken and .accessTokenSecret.
+  */
+class TwitterInputDStreamCTakes(
+                             @transient ssc_ : StreamingContext,
+                             twitterAuth: Option[Authorization],
+                             filters: Seq[String],
+                             storageLevel: StorageLevel,
+                             slideSeconds: Int
+                             ) extends ReceiverInputDStream[Status](ssc_) {
+
+    override def slideDuration(): Duration = {
+      System.out.println("returning duration seconds = " + slideSeconds );
+      return Seconds(slideSeconds)
+    }
+
+    private def createOAuthAuthorization(): Authorization = {
+      new OAuthAuthorization(new ConfigurationBuilder().build())
+    }
+
+    private val authorization = twitterAuth.getOrElse(createOAuthAuthorization())
+
+    override def getReceiver(): Receiver[Status] = {
+      new TwitterReceiver(authorization, filters, storageLevel)
+    }
+  }
+
+  class TwitterReceiver(
+                         twitterAuth: Authorization,
+                         filters: Seq[String],
+                         storageLevel: StorageLevel
+                         ) extends Receiver[Status](storageLevel) with Logging {
+
+    @volatile private var twitterStream: TwitterStream = _
+    var total=0;
+    override def store(status:Status): Unit = {
+        super.store(status)
+    }
+
+    def statusListener():StatusListener = {
+      new StatusListener {
+        def onStatus(status: Status) = {
+            System.out.println("Tweet  : "+status.getText)
+            store(status)
+        }
+        def onDeletionNotice(statusDeletionNotice: StatusDeletionNotice) {}
+
+        def onTrackLimitationNotice(i: Int) {}
+
+        def onScrubGeo(l: Long, l1: Long) {}
+
+        def onStallWarning(stallWarning: StallWarning) {}
+
+        def onException(e: Exception) {
+          e.printStackTrace();
+          Thread.sleep(10000);
+          if(! stopped) {
+            restart("Error receiving tweets", e)
+          }
+        }
+      }
+    }
+    @volatile var stopped = false;
+    override def onStart()= {
+      logInfo("Waiting 5 seconds to start, to prevent abuse.")
+      Thread.sleep(5000)
+      stopped=false;
+      val future =
+        new Thread(
+          new Runnable() {
+            def run() = {
+              try {
+                System.out.println("Consumer k = " + System.getProperty("twitter4j.oauth.consumerKey"))
+                val newTwitterStream = new TwitterStreamFactory().getInstance(twitterAuth)
+                newTwitterStream.addListener(statusListener)
+
+                val query = new FilterQuery
+                if (filters.size > 0) {
+                  query.track(filters.toArray)
+                  newTwitterStream.filter(query)
+                }
+                else {
+                  newTwitterStream.sample()
+                }
+                setTwitterStream(newTwitterStream)
+
+                logInfo("Twitter receiver started")
+              }
+              catch {
+                case e: Exception =>
+                  restart("Error starting Twitter stream", e)
+              }
+            }
+          });
+
+      future.start();
+    }
+
+    def onStop() {
+      stopped=true;
+      setTwitterStream(null)
+    }
+
+    private def setTwitterStream(newTwitterStream: TwitterStream) = synchronized {
+      if (twitterStream != null) {
+        twitterStream.shutdown()
+      }
+      twitterStream = newTwitterStream
+    }
+  }
+
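The onException handler above sleeps for ten seconds and then asks Spark to restart the receiver unless it has been stopped. The same back-off-and-retry shape can be sketched in plain Scala; the helper below is illustrative only and not part of this commit:

```scala
// Sketch of the restart-on-error pattern used by TwitterReceiver.onException:
// pause briefly after a failure, then retry unless a stop flag has been raised.
object RetryDemo {
  @volatile var stopped = false

  // Retry `action` up to `maxAttempts` times, pausing `pauseMs` between failures.
  def withRestart[T](maxAttempts: Int, pauseMs: Long)(action: => T): Option[T] = {
    var attempt = 0
    while (attempt < maxAttempts && !stopped) {
      try {
        return Some(action)
      } catch {
        case _: Exception =>
          attempt += 1
          Thread.sleep(pauseMs)
      }
    }
    None
  }

  def main(args: Array[String]): Unit = {
    var calls = 0
    // Fails twice, then succeeds on the third attempt.
    val result = withRestart(5, 1L) {
      calls += 1
      if (calls < 3) throw new RuntimeException("transient error")
      "connected"
    }
    println(result.getOrElse("gave up"))  // connected
    println(calls)                        // 3
  }
}
```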

Added: ctakes/sandbox/src/main/scala/sparkapps/ctakes/TwitterUtilsJ.scala
URL: http://svn.apache.org/viewvc/ctakes/sandbox/src/main/scala/sparkapps/ctakes/TwitterUtilsJ.scala?rev=1635170&view=auto
==============================================================================
--- ctakes/sandbox/src/main/scala/sparkapps/ctakes/TwitterUtilsJ.scala (added)
+++ ctakes/sandbox/src/main/scala/sparkapps/ctakes/TwitterUtilsJ.scala Wed Oct 29 15:13:47 2014
@@ -0,0 +1,121 @@
+package sparkapps.ctakes
+
+import org.apache.spark.streaming.twitter.TwitterInputDStream
+import twitter4j.Status
+import twitter4j.auth.Authorization
+import org.apache.spark.storage.StorageLevel
+import org.apache.spark.streaming.StreamingContext
+import org.apache.spark.streaming.api.java.{JavaReceiverInputDStream, JavaDStream, JavaStreamingContext}
+import org.apache.spark.streaming.dstream.{ReceiverInputDStream, DStream}
+
+/**
+ * Modified and borrowed from the Databricks Spark tutorial.
+ */
+object TwitterUtilsCtakes {
+  /**
+   * Create an input stream that returns tweets received from Twitter.
+   * @param ssc         StreamingContext object
+   * @param twitterAuth Twitter4J authentication, or None to use Twitter4J's default OAuth
+   *        authorization; this uses the system properties twitter4j.oauth.consumerKey,
+   *        twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
+   *        twitter4j.oauth.accessTokenSecret
+   * @param filters Set of filter strings to get only those tweets that match them
+   * @param storageLevel Storage level to use for storing the received objects
+   */
+  def createStream(
+                    ssc: StreamingContext,
+                    twitterAuth: Option[Authorization],
+                    filters: Seq[String] = Nil,
+                    storageLevel: StorageLevel = StorageLevel.MEMORY_AND_DISK_SER_2
+                    ): ReceiverInputDStream[Status] =
+  {
+    // 2-second slide duration
+    new TwitterInputDStreamCTakes(ssc, twitterAuth, filters, storageLevel, 2)
+
+  }
+
+  /**
+   * Create an input stream that returns tweets received from Twitter using Twitter4J's default
+   * OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
+   * twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
+   * twitter4j.oauth.accessTokenSecret.
+   * Storage level of the data will be the default StorageLevel.MEMORY_AND_DISK_SER_2.
+   * @param jssc   JavaStreamingContext object
+   */
+  def createStream(jssc: JavaStreamingContext): JavaReceiverInputDStream[Status] = {
+    createStream(jssc.ssc, None)
+  }
+
+  /**
+   * Create an input stream that returns tweets received from Twitter using Twitter4J's default
+   * OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
+   * twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
+   * twitter4j.oauth.accessTokenSecret.
+   * Storage level of the data will be the default StorageLevel.MEMORY_AND_DISK_SER_2.
+   * @param jssc    JavaStreamingContext object
+   * @param filters Set of filter strings to get only those tweets that match them
+   */
+  def createStream(jssc: JavaStreamingContext, filters: Array[String]
+                    ): JavaReceiverInputDStream[Status] = {
+    createStream(jssc.ssc, None, filters)
+  }
+
+  /**
+   * Create an input stream that returns tweets received from Twitter using Twitter4J's default
+   * OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
+   * twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
+   * twitter4j.oauth.accessTokenSecret.
+   * @param jssc         JavaStreamingContext object
+   * @param filters      Set of filter strings to get only those tweets that match them
+   * @param storageLevel Storage level to use for storing the received objects
+   */
+  def createStream(
+                    jssc: JavaStreamingContext,
+                    filters: Array[String],
+                    storageLevel: StorageLevel
+                    ): JavaReceiverInputDStream[Status] = {
+    createStream(jssc.ssc, None, filters, storageLevel)
+  }
+
+  /**
+   * Create an input stream that returns tweets received from Twitter.
+   * Storage level of the data will be the default StorageLevel.MEMORY_AND_DISK_SER_2.
+   * @param jssc        JavaStreamingContext object
+   * @param twitterAuth Twitter4J Authorization
+   */
+  def createStream(jssc: JavaStreamingContext, twitterAuth: Authorization
+                    ): JavaReceiverInputDStream[Status] = {
+    createStream(jssc.ssc, Some(twitterAuth))
+  }
+
+  /**
+   * Create an input stream that returns tweets received from Twitter.
+   * Storage level of the data will be the default StorageLevel.MEMORY_AND_DISK_SER_2.
+   * @param jssc        JavaStreamingContext object
+   * @param twitterAuth Twitter4J Authorization
+   * @param filters     Set of filter strings to get only those tweets that match them
+   */
+  def createStream(
+                    jssc: JavaStreamingContext,
+                    twitterAuth: Authorization,
+                    filters: Array[String]
+                    ): JavaReceiverInputDStream[Status] = {
+    createStream(jssc.ssc, Some(twitterAuth), filters)
+  }
+
+  /**
+   * Create an input stream that returns tweets received from Twitter.
+   * @param jssc         JavaStreamingContext object
+   * @param twitterAuth  Twitter4J Authorization object
+   * @param filters      Set of filter strings to get only those tweets that match them
+   * @param storageLevel Storage level to use for storing the received objects
+   */
+  def createStream(
+                    jssc: JavaStreamingContext,
+                    twitterAuth: Authorization,
+                    filters: Array[String],
+                    storageLevel: StorageLevel
+                    ): JavaReceiverInputDStream[Status] = {
+    createStream(jssc.ssc, Some(twitterAuth), filters, storageLevel)
+  }
+}

Added: ctakes/sandbox/src/main/scala/sparkapps/ctakes/Utils.scala
URL: http://svn.apache.org/viewvc/ctakes/sandbox/src/main/scala/sparkapps/ctakes/Utils.scala?rev=1635170&view=auto
==============================================================================
--- ctakes/sandbox/src/main/scala/sparkapps/ctakes/Utils.scala (added)
+++ ctakes/sandbox/src/main/scala/sparkapps/ctakes/Utils.scala Wed Oct 29 15:13:47 2014
@@ -0,0 +1,124 @@
+package sparkapps.ctakes
+
+import java.util.Date
+
+import org.apache.commons.cli.{Options, ParseException, PosixParser}
+import org.apache.spark.mllib.linalg.Vector
+import org.apache.spark.mllib.feature.HashingTF
+import twitter4j._
+import twitter4j.auth.OAuthAuthorization
+import twitter4j.conf.ConfigurationBuilder
+
+/**
+ * Modified and borrowed from the Databricks Spark tutorial.
+ */
+object Utils {
+
+  /**
+   * Glue code for declarative testing of args. See the impl example in the Driver class.
+   * Verifies that each input in the list passes the tester function;
+   * prints an error and exits if not.
+   */
+  def checkpoint(
+                  tester : Any => Boolean,
+                  error : Any => Unit,
+                  inputs: List[String]): Unit = {
+    System.out.println("~~~~~~ Checkpoint ~~~~~")
+    def test(failures:Int, tests : List[String]):Boolean= {
+      tests match {
+        case List() => {
+          return failures == 0
+        }
+        case other => {
+          return test(failures + {
+            if (tester(tests.head)){0} else {
+              error(tests.head)
+              1
+            }
+          },
+          tests.tail)
+        }
+      }
+    }
+   if ( ! test(0,inputs) )
+     System.exit(2)
+  }
+
+  def getAuth = {
+    Some(new OAuthAuthorization(new ConfigurationBuilder().build()))
+  }
+
+  object IntParam {
+    def unapply(str: String): Option[Int] = {
+      try {
+        Some(str.toInt)
+      } catch {
+        case e: NumberFormatException => None
+      }
+    }
+  }
+
+
+  /**
+   * A mock Status object for testing Twitter streaming without
+   * actually connecting to Twitter.
+   */
+  def mockStatus: Status = {
+    new Status {
+
+      override def getPlace: Place = ???
+
+      override def isRetweet: Boolean = ???
+
+      override def isFavorited: Boolean = ???
+
+      override def getCreatedAt: Date = ???
+
+      override def getUser: User = ???
+
+      override def getContributors: Array[Long] = ???
+
+      override def getRetweetedStatus: Status = ???
+
+      override def getInReplyToScreenName: String = "------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------"
+
+      override def isTruncated: Boolean = ???
+
+      override def getId: Long = ???
+
+      override def getCurrentUserRetweetId: Long = ???
+
+      override def isPossiblySensitive: Boolean = ???
+
+      override def getRetweetCount: Long = ???
+
+      override def getGeoLocation: GeoLocation = ???
+
+      override def getInReplyToUserId: Long = ???
+
+      override def getSource: String = System.currentTimeMillis()+"SADFSADFASDFSDFSDFFffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff"
+
+      override def getText: String = "ASDFsdofaidsjofaisjdofiajsdofijadsfASDFsdofaidsjofaisjdofiajsdofijadsfASDFsdofaidsjofaisjdofiajsdofijadsfASDFsdofaidsjofaisjdofiajsdofijadsf"+System.currentTimeMillis()
+
+      override def getInReplyToStatusId: Long = ???
+
+      override def isRetweetedByMe: Boolean = ???
+
+      override def compareTo(p1: Status): Int = ???
+
+      override def getHashtagEntities: Array[HashtagEntity] = ???
+
+      override def getURLEntities: Array[URLEntity] = ???
+
+      override def getMediaEntities: Array[MediaEntity] = ???
+
+      override def getUserMentionEntities: Array[UserMentionEntity] = ???
+
+      override def getAccessLevel: Int = ???
+
+      override def getRateLimitStatus: RateLimitStatus = ???
+
+    }
+  }
+}
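Utils.checkpoint and Utils.IntParam above implement a small validate-all-arguments pattern: run a tester over every input, report each failure, and succeed only if all pass. A standalone sketch of the same idea (names mirror the commit, but the snippet has no Spark or Twitter dependencies):

```scala
object ArgCheckDemo {
  // Extractor mirroring Utils.IntParam: matches strings that parse as Int.
  object IntParam {
    def unapply(str: String): Option[Int] =
      try Some(str.toInt) catch { case _: NumberFormatException => None }
  }

  // Returns true only if every input passes `tester`; calls `error` on each failure.
  def checkAll(tester: String => Boolean, error: String => Unit, inputs: List[String]): Boolean =
    inputs.map { in => if (tester(in)) true else { error(in); false } }.forall(identity)

  def main(args: Array[String]): Unit = {
    val isInt: String => Boolean = { case IntParam(_) => true; case _ => false }
    println(checkAll(isInt, s => Console.err.println(s"not an int: $s"), List("1", "2")))  // true
    println(checkAll(isInt, s => Console.err.println(s"not an int: $s"), List("1", "x")))  // false
  }
}
```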

Added: ctakes/sandbox/src/test/scala/TestSpark.scala
URL: http://svn.apache.org/viewvc/ctakes/sandbox/src/test/scala/TestSpark.scala?rev=1635170&view=auto
==============================================================================
--- ctakes/sandbox/src/test/scala/TestSpark.scala (added)
+++ ctakes/sandbox/src/test/scala/TestSpark.scala Wed Oct 29 15:13:47 2014
@@ -0,0 +1,16 @@
+import com.fasterxml.jackson.annotation.JsonUnwrapped
+import com.google.common.annotations.VisibleForTesting
+import sparkapps.SparkApp1
+
+/**
+ * Created by apache on 7/20/14.
+ */
+class TestSpark {
+
+  @org.junit.Test
+  def test(){
+    // Very simple unit test; we will improve it eventually.
+    SparkApp1.main(Array("1"));
+
+  }
+}

Propchange: ctakes/sandbox/src/test/scala/TestSpark.scala
------------------------------------------------------------------------------
    svn:executable = 

Added: ctakes/sandbox/src/test/scala/TestStreaming.scala
URL: http://svn.apache.org/viewvc/ctakes/sandbox/src/test/scala/TestStreaming.scala?rev=1635170&view=auto
==============================================================================
--- ctakes/sandbox/src/test/scala/TestStreaming.scala (added)
+++ ctakes/sandbox/src/test/scala/TestStreaming.scala Wed Oct 29 15:13:47 2014
@@ -0,0 +1,7 @@
+/**
+ * Created by jay on 10/28/14.
+ */
+class TestStreaming {
+
+
+}

Added: ctakes/sandbox/twitter
URL: http://svn.apache.org/viewvc/ctakes/sandbox/twitter?rev=1635170&view=auto
==============================================================================
--- ctakes/sandbox/twitter (added)
+++ ctakes/sandbox/twitter Wed Oct 29 15:13:47 2014
@@ -0,0 +1,5 @@
+delete this line, and then place this file in /tmp/twitter after getting your tokens from https://apps.twitter.com/app/
+consumerKey=xxxxxxxxxxxxxxxxxx
+consumerSecret=xxxxxxxxxxxxxxx
+accessToken=xxxxxxxxxxxxxxxxxxxxxxx
+accessTokenSecret=xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
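Once this file is filled in, the streaming code expects each value as a twitter4j.oauth.* system property (see the Consumer key lookup in TwitterInputDStreamCTakes). A hedged sketch of loading such a key=value file with only the JDK; the prefixing helper is an assumption for illustration, not code from this commit:

```scala
import java.io.{File, FileInputStream, PrintWriter}
import java.util.Properties
import scala.collection.JavaConverters._

object TwitterPropsDemo {
  // Load key=value pairs from `path` and return them re-keyed as twitter4j.oauth.<key>.
  def exportOAuthProps(path: String): Map[String, String] = {
    val props = new Properties()
    val in = new FileInputStream(path)
    try props.load(in) finally in.close()
    props.asScala.map { case (k, v) => (s"twitter4j.oauth.$k", v) }.toMap
  }

  def main(args: Array[String]): Unit = {
    // Write a throwaway file in the same format as ctakes/sandbox/twitter.
    val f = File.createTempFile("twitter", ".properties")
    val w = new PrintWriter(f)
    w.println("consumerKey=abc")
    w.println("consumerSecret=def")
    w.close()

    val oauth = exportOAuthProps(f.getPath)
    oauth.foreach { case (k, v) => System.setProperty(k, v) }
    println(System.getProperty("twitter4j.oauth.consumerKey"))  // abc
  }
}
```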


